Editor’s Preface


With the declaration of the Millennium Development Goals in 2000, the global 
fight against poverty has become one of the central concerns in global policy de- 
bates. Poverty alleviation policies aim at reducing extreme poverty, hunger, poor 
health and education outcomes. Despite considerable efforts, many regions and 
countries in the world are still performing very badly concerning these dimen- 
sions of poverty. This may be due mainly to an ineffective targeting of policies 
to address the root causes of poverty. Sustainable policy interventions are in need
of reliable concepts of poverty and of a thorough understanding of the underlying
mechanisms that lead to such deprivation.


This dissertation contributes to analyzing unresolved important issues in the 
fight against poverty by proposing and applying specific statistical methodologies 
to analyze the extent of poverty and its underlying factors based on recent house- 
hold surveys in developing countries. In the first essay, Johannes Gräb is con- 
cerned with poverty measurement. In particular, the author elaborates a concept 
for poverty comparisons when the indicator of well being is observed over several 
points in time. Gräb shows that comparisons based upon the stochastic dominance 
methodology are not only robust to any specific poverty index and to any arbitrary 
setting of the poverty line but additionally to any aggregation procedures of indi- 
vidual incomes over time. He illustrates his approach by performing multiperiod 
poverty comparisons for Indonesia and Peru. The results show considerable de- 
pendence of poverty orderings on the aggregation procedures of income over time, 
emphasizing the relevance of the approach. 


In the second essay, Gräb takes a closer look at the importance of geographic
factors in explaining observed spatial variation in household income. The author
suggests a novel methodology to address this issue. He builds a multilevel random
coefficient model able to decompose the variance in living standards across
different spatial levels. Such an approach is particularly interesting from a political
point of view since it allows effective targeting of spatial units. In the empirical
part, Gräb decomposes the sources of spatial disparities in incomes among
households in Burkina Faso, showing that spatial disparities are not only driven by
the spatial concentration of households with particular endowments but to a large
extent also by disparities in community endowments.

The third essay takes into account the multidimensionality of poverty. Households
in developing countries suffer not only from low levels of income but
also from deprivation in other dimensions of well-being, such as health or education.
It is therefore important to analyze the underlying mechanisms that lead to observed
outcomes in non-monetary indicators of poverty to deduce effective poverty reduction
policies. Gräb investigates the role of cultural, geographic, and political factors in the
relationship between anthropometric outcomes of children and under-5 mortality rates.
He focuses on the unique situation of the territory around Lake Victoria, which
shows a pattern of low levels of malnutrition together with dramatically high rates
of mortality found in no other region in Sub-Saharan Africa. Applying linear and
nonlinear multilevel regression analysis, the author finds a unique interplay of
cultural, geographic and political factors in the Lake Victoria region to be responsible
for causing the described paradox.

Johannes Gräb thus addresses a number of highly topical issues discussed in 
the ongoing literature on poverty and inequality in developing countries. He also 
provides very important new insights for our understanding of poverty in its many 
dimensions. Beyond this, Gräb narrows the gap between the comprehensive statistical
toolbox and its still limited application in development economic research. With 
his analysis, Gräb provides a valuable contribution to the economics literature on 
the empirical analysis of poverty in general, and on poverty comparisons, spatial 
income inequality, child mortality and undernutrition in particular. 


Prof. Stephan Klasen, Ph.D. 
Göttingen, June 2009 
Introduction and Overview


Following the proclamation of the Millennium Development Goals, public
awareness of the miserable living conditions of the poor in developing countries
increased considerably. The fact that more than a billion people were classified as
poor prompted society, political authorities and economic researchers to deal more
intensively with the extreme hardship of the poor. Poverty has become one of the main
topics of current economic policy debates. Recent World Economic Summits in
Heiligendamm and Toyako dealt specifically with the economic development of
Africa and Asia to reduce global poverty and with the political action necessary
to provide food security all over the world. Privately organized events like ‘Live
8’ or ‘Stand Against Poverty’ encouraged thousands of people around the globe to take
action against poverty by jointly pressuring political leaders to increase financial
aid. Public and private efforts aimed at implementing poverty alleviation policies
to reduce the number of people suffering from poor housing conditions,
inadequate nutritional intake or insufficient education.

Implementing successful poverty alleviation policies requires targeting the
factors that cause poverty. Policy makers therefore need (1) a concept
of poverty and its classification, (2) a thorough understanding of the underlying
mechanisms that lead to poverty and (3) empirical methods to analyze (1)
and (2). This dissertation aims to contribute to the latter by proposing and applying
specific statistical methodologies to analyze the extent of poverty and its
underlying factors based on recent household surveys in developing countries.


Empirical development economics 


The measurement of poverty and the analysis of poverty-causing factors are largely
based on the application of statistical methods. As in economics generally, research
on poverty focusses particularly on issues for which empirical data are available,
formulating theoretical hypotheses, most often on causal relationships,
which are then tested by estimation. There is perhaps no other area of science
where the application of quantitative methods to statistical data to test theoretical
assumptions is as prominent as in economics. The prevalent application of mathematical
statistics in empirical economics led Ragnar Frisch and Joseph Schumpeter
as early as the 1930s to coin the term econometrics and to establish
the Econometric Society. In the editorial of the first issue of Econometrica, Frisch
(1933) noted: “Experience has shown that each of those three viewpoints, that of 
statistics, economic theory, and mathematics, is a necessary, but not by itself a 
sufficient, condition for a real understanding of the quantitative relations in mod- 
ern economic life. It is the unification of all three that is powerful. And it is this 
unification that constitutes econometrics.” 

While much of early development economics was entirely theoretical (see e.g. 
Rosenstein-Rodan (1943); Leibenstein (1957); Sen (1973)), there has been a clear 
shift in the last two decades towards mainstream empirical, i.e. econometric,
economics (Ray, 2007). As Mookherjee (2005) notes: “Development economics 
is increasingly becoming an empirical discipline today.” This phenomenon can be 
ascribed to two main trends of recent decades. On the one hand, technical progress 
and the implementation of user-friendly statistical software greatly facilitated the
application of complex and computationally intensive statistical methods. More
importantly, however, data availability expanded extensively. Since the beginning 
of the 1990s, household survey data, providing requisite information at the indi- 
vidual level, has become available for most developing countries. Application of 
miscellaneous methodologies has since enabled researchers to measure and com- 
pare the extent of poverty, to identify and quantify its driving factors or to evaluate 
competing poverty reduction strategies. 

The present prominence of empirics has recently brought a discussion into de- 
velopment economics that has already had a longstanding tradition in mainstream 
economics. The discussion is about the actual value added of empirical research. 
There are basically two main concerns. (i) The econometric validity of empirical 
results is often, especially in the case of causal regression analysis, disputable. 
The key problem in regression analysis is to infer causality from simple correla- 
tion. Seven decades ago, Keynes (1939) already expressed his concern about the 
usefulness of causal inference based on regression analysis by commenting on the 
“slippery problem of passing from statistical description to inductive generaliza- 
tion in the case of simple correlation”. (ii) The first concern directly provokes 
the second. To circumvent possible econometric biases, empirical papers nowadays
focus on combating the econometric problems. Authors therefore concentrate
on specific phenomena that may be analyzed, given the data, in a solid way.
This comes at a cost: generalization of the often microscopic results is seldom
reasonable.1


1For a recent debate on theory versus empirics in development economics see Mookherjee
(2005), whose article is followed by comments from Bardhan (2005), Basu (2005), Banerjee
(2005), and Kanbur (2005). For a general critique of the application of econometrics see Hendry
(1980).
These concerns clearly have to be taken into account in further (development)
economic research. There is, however, no doubt that much progress has already 
been made to circumvent some of the cumbersome problems of econometrics and 
that empirical findings have, in Ragnar Frisch’s sense, indeed provided notable 
contributions for a better understanding of the quantitative relations in modern 
economic life. 

From a development economics perspective, empirical methods have proved 
to be particularly useful in poverty analysis. Research on poverty measurement 
facilitated the comprehension of the extent of poverty and its evolution over time 
and space. In a recent contribution, Ravallion et al. (2008) used regression analysis
to revise the international “$1 a day” poverty line. Based on new empirical data,
the authors propose an absolute international poverty line of $1.25. Empirical 
findings like these contribute to poverty reduction by enabling decision makers to 
target specifically those groups of the world population who are in particular need 
of poverty alleviation programmes. 

Like poverty measurement, causal poverty analysis may contribute significantly
to poverty reduction by identifying driving factors of poverty. To circumvent
the problem of endogeneity described above, recent contributions have
focussed on the application of instrumental variables in regression analysis and
the use of randomized controlled trials. Randomized controlled trials (RCTs) involve
the random allocation of different interventions (treatments or conditions) to
subjects to create exogenous benchmark groups. If RCTs are not available, econometric
methods may help to create artificial experiments which may then serve
as benchmarks. Using an RCT, Miguel and Kremer (2004) found that children
attending schools where de-worming medicine was distributed came to school
more regularly.2 In a similar paper, Chattopadhyay and Duflo (2004) conclude
that panchayats3 headed by a woman perform significantly better, e.g. in
the provision of water. Identified driving factors of poverty, like women as
political leaders or de-worming of children, serve as a starting point to deduce
effective policy interventions.

The papers of Ravallion et al. (2008), Miguel and Kremer (2004) and Chat- 
topadhyay and Duflo (2004) show two things: (i) the relevance of empirical devel- 
opment economics for effective poverty reduction policies and (ii) the impressive 
progress of empirical analysis since the times of Keynes (1939) or Hendry (1980). 

The enormous potential of statistical analysis in poverty research has, however,
still not been sufficiently exploited. Quite the contrary, the scope for analytical
poverty research widens continuously: the constant appearance of new comprehensive
data sets, such as the release of household panel data in more and more
developing countries, broadens the area of application, while rapid progress in
natural science entails steady development of new methodologies.

2In this case, the data were based on an RCT since de-worming medicine was distributed to
children of randomly assigned schools in Western Kenya.

3Panchayats are local government bodies at the village level in India.

This dissertation is a collection of three independent essays that deal with the 
application of econometric methods, based on household survey data, in the con- 
text of poverty research. All studies consider well-known statistical concepts that 
have not yet been applied to the research questions under consideration. By this
means, the papers contribute to research in development economics in two ways. 
The application of recent methodologies allows gaining better insight and deriving 
new findings on the measurement of poverty and its underlying factors. Simultaneously,
the application helps to narrow the gap between the comprehensive
statistical toolbox and its still limited application in development economic
research.

The first chapter deals with the concept of poverty comparisons when the well- 
being indicator, income, is observed over consecutive periods. The second chap- 
ter studies the determinants of spatial inequality in household income in Burkina 
Faso, by decomposing overall inequality into inequality within and between nested
spatial levels. The third chapter analyzes the relation between a child’s nutritional 
status, derived from its stunting and wasting z-score, and its survival probability, 
to resolve the paradox of high mortality but low malnutrition rates in the Lake 
Victoria region of Kenya. 


Three essays in empirical development economics 
A concept for multiperiod poverty comparisons 


Research on poverty measurement is closely connected to the seminal work of 
Sen (1976). Sen distinguishes between two fundamental issues: identifying the 
poor within the population by setting a poverty line and constructing a poverty 
index to measure the extent of deprivation. Based on an axiomatic approach Sen 
constructs a poverty measure capable of performing ordinal welfare comparisons. 

Following Sen, research on poverty measurement evolved into two strands 
of literature: (i) the construction of poverty indices to measure poverty and (ii)
the generation of poverty orderings to compare poverty. The first strand, poverty 
measurement, deals with the attempt to construct summary poverty indices that 
capture several concepts of poverty and satisfy various poverty axioms (see e.g. 
Foster et al. (1984); Atkinson (1987); Zheng (1997)). Beyond the poverty headcount,
which measures the fraction of people below the poverty line, the concepts
include, among others, the poverty gap to capture the average extent of individual
poverty and the squared poverty gap to measure the inequality among the poor.
Among the numerous proposed poverty axioms, researchers agreed on a core set
of axioms each poverty measure should satisfy: focus, continuity, monotonicity
and distribution sensitivity.4


The variety of proposed measures implicitly creates the need for the second
strand of the literature. Any choice of poverty measure is arbitrary and may lead
to different outcomes. The second strand, poverty orderings, addresses exactly
this arbitrariness by proposing methodologies that yield rankings of poverty which
are robust to alternative poverty measures. Another source of arbitrariness in poverty
measurement is the setting of the poverty line. Poverty line construction is
usually based on minimum nutritional intake. Since there is no exact level of food
intake requirements, different reasonable poverty lines are conceivable (Atkinson,
1983). The literature on poverty orderings examines the rankings of distributions
of one or more indicators of well-being to yield poverty comparisons which are 
robust to a wide range of poverty measures and poverty lines (see e.g. Atkin- 
son and Bourguignon (1982); Atkinson (1987); Foster and Shorrocks (1988a,b,c); 
Duclos et al. (2006b)). 


Essay 1 follows the second strand of literature by elaborating a concept for 
multiperiod poverty comparisons. Beyond the choice of a suitable poverty measure
and poverty line, the paper deals with the question of how poverty can be measured
and compared when the indicator of well-being is observed over several
points in time. Specifically, Essay 1 proposes a concept to compare an individ- 
ual’s well-being over consecutive periods as well as to compare well-being of two 
different individuals observed at two concurrent periods. The proposed method- 
ology allows for multiperiod poverty comparisons that are robust to any specific 
poverty index, to any arbitrary setting of the poverty line and to any aggregation 
procedures of individual incomes over time. The elaborated concept, which is, 
following Atkinson (1987) and Duclos et al. (2006b), based upon the stochastic 
dominance methodology, is illustrated by performing multiperiod poverty com- 
parisons for Indonesia and Peru. Showing considerable dependence of poverty 
orderings on the aggregation procedures of income over time, the results empha- 
size the relevance of the approach. 


Econometric analysis of spatial inequality 


Causal poverty analysis, based on regression models, allows identifying driving 
factors of households’ living standards. A specific debate has taken place in
development economics about the importance of geographic factors in explaining 
observed spatial variation in household income (see e.g. Ravallion and Wodon 
(1999); Jalan and Ravallion (2002); De Vreyer et al. (2009); Grimm and Klasen 


4For a discussion of relevant axioms see u: (2000).
(2008)).5 Several developing economies show areas that are persistently poor.
Two divergent views exist in explaining why some regions perform better than 
other regions within the same country. Areas could be poor due to a spatial con- 
centration of households with similar, poor, characteristics. According to this 
view, geographic endowments do not play a role in determining households’ in- 
come. On the other hand, geographic capital might be correlated with living stan- 
dards of different regions. Differences in area-specific factors, like climate or 
altitude in terms of pure geographic factors, or infrastructure in terms of area en- 
dowments, may directly have a causal role in determining households’ welfare. 


Using different regression techniques, Ravallion and Wodon (1999), Jalan and
Ravallion (2002) and Benson et al. (2005) analyze whether differences in households’
living standards across spatial entities within a country are entirely attributable
to a spatial segregation of people with similar endowments, or to geography per
se. All studies conclude that it is not solely a spatial correlation of differences in
mobile non-geographic characteristics that makes areas poor. Specific factors of
a household’s area of residence matter by restraining households’ income growth
and by altering returns to private endowments.


While all these studies suggest that poverty reduction efforts have to be targeted
at the sub-national level, they do not provide a decomposition of the variance
in living standards observed within and between nested spatial units. Consequently,
the studies cannot weigh the influence of the different spatial units on the
variance in income levels. Essay 2 suggests a novel methodology to address this
issue by building a multilevel random coefficient model able to decompose the
variance in living standards across four spatial levels: households, communities,
provinces and (agro-climatic) regions. Knowledge of the relevance of each spatial
level for household income generation is particularly important from a political
point of view: since there may be constraints on the ability to target household
characteristics, targeting spatial units effectively seems crucial.
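
To make the idea of a variance decomposition across nested spatial levels concrete, the following is a simplified sketch and not the model estimated in Essay 2. It uses random intercepts only (no random coefficients); the index letters (household h, community c, province p, region r), the variance symbols and the assumption of mutually independent random effects are illustrative assumptions.

```latex
% Simplified four-level random-intercept sketch (notation assumed, not taken from Essay 2).
% u_r, v_{pr}, w_{cpr} and \varepsilon_{hcpr} are assumed mutually independent with mean zero.
\begin{align*}
y_{hcpr} &= x_{hcpr}'\beta + u_r + v_{pr} + w_{cpr} + \varepsilon_{hcpr}, \\
\operatorname{Var}(y_{hcpr} \mid x_{hcpr}) &= \sigma_u^2 + \sigma_v^2 + \sigma_w^2 + \sigma_\varepsilon^2, \\
\rho_{\text{community}} &= \frac{\sigma_w^2}{\sigma_u^2 + \sigma_v^2 + \sigma_w^2 + \sigma_\varepsilon^2}.
\end{align*}
```

Under these assumptions, the share computed for each level (shown here for communities) indicates how much of the unexplained variation in income is located at that level.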


Based on the proposed multilevel modeling approach, Essay 2 decomposes the 
sources of spatial disparities in incomes among households in Burkina Faso. The 
results show that spatial disparities are not only driven by the spatial concentration 
of households with particular endowments but to a large extent also by disparities 
in community endowments. Climatic differences across regions also matter,
but to a much smaller extent. 


5For a discussion on the importance of geographic factors in a cross country setting see e.g.
Acemoglu et al. (2001); Hall and Jones (1999); Gallup et al. (1998).
Empirical analysis of child mortality and undernutrition


Essay 1 and Essay 2 focus on monetary poverty, i.e. low levels of income and 
consumption, respectively. Households in developing countries also suffer from 
other dimensions of poverty. While poverty was initially regarded as a monetary
phenomenon, its multidimensionality is now widely accepted (Sen, 1987;
Strauss and Thomas, 1998). Most of the concepts for the measurement and analysis
of poverty have therefore been developed in such a way that they are also applicable
to non-monetary indicators of well-being, such as the health, nutritional or educational
status of an individual (see e.g. Bourguignon and Chakravarty (2003); Duclos
et al. (2006b)). Deprivation in the multiple dimensions of poverty should hence
be taken into account when measuring poverty, and the underlying mechanisms that
lead to observed outcomes should be analyzed to deduce effective poverty reduction
policies.

Two of the most challenging remaining problems in the fight against poverty are the
persistently high rates of undernutrition and child mortality. One of the major causes
of child mortality is thought to be undernutrition itself. Pelletier et al. (1995) 
claim that undernutrition is the underlying cause of more than 50% of all child 
deaths in the world. The close relationship between a child’s nutritional status 
and its survival probability is challenged when nutrition and mortality outcomes 
are analyzed in the Lake Victoria Region of Kenya. Essay 3 shows that there is no 
other region in Sub-Saharan Africa where the pattern of low levels of malnutrition 
together with dramatically high rates of mortality is as pronounced as around Lake 
Victoria. 

Essay 3 investigates the role of cultural, geographic, and political factors in
the relationship between anthropometric outcomes of children and under-5 mortality
rates in Kenya, with an explicit focus on the unique situation of the territory around
Lake Victoria. Using linear and nonlinear multilevel regression analysis to control
for unobserved household and community characteristics, the driving factors
of mortality, stunting, and wasting are analyzed jointly.

The findings point to a unique interplay of cultural, geographic and political 
factors in the Lake Victoria region which are responsible for causing the described 
paradox. The results demonstrate not only the relevance of considering and
understanding the country-specific context when analyzing child health outcomes
but also that the common practice of making inferences about health status based on
anthropometric outcomes must be applied with great caution and can easily lead
to erroneous results.
Essay 1


Robust Multiperiod Poverty 
Comparisons 


Abstract: We propose a methodology for comparing poverty over multiple peri- 
ods across time and space that does not arbitrarily aggregate income over various 
years or rely on arbitrarily specified poverty lines or poverty indices. We use 
multivariate stochastic dominance tests to create dominance surfaces for different 
time spans. We elaborate the method first for the bidimensional case, using as 
dimensions income observed over two periods: one at the beginning and one at 
the end of a time span. Subsequently, we extend it to the case where incomes 
are observed over n periods. We illustrate our approach by performing poverty
comparisons using data for Indonesia and Peru. 


based on joint work with Michael Grimm. 
1.1 Introduction


Today it is well established that poverty is a dynamic phenomenon. But if poverty
does fluctuate and evolve over time, this raises the question of how best to measure 
it over multiple periods. Cross-sectional poverty measures can provide abundant 
information on the extent of poverty at a given point, but almost none on the rate 
at which people escape from or fall into poverty over time. 

Recognizing this, authors such as Grootaert and Kanbur (1995) have suggested 
focusing on households’ changes in poverty status. Others have developed concepts
to aggregate incomes over multiple periods (i.e., trajectories of income over
time) using an evaluation function that explicitly captures, for example, the risk 
aversion of households (see e.g., Cruces (2005)). While such an approach has the 
advantage of accounting for the negative effects of income variability on the well- 
being of households, it requires arbitrary assumptions about how exactly ‘risk- 
adjusted mean income’ is best computed. 

Likewise, considering the standard spells and component approaches proposed 
for measuring and conceptualizing chronic and transient poverty, one can safely 
state that the results and consequently the policy implications depend heavily on 
how the two forms of poverty are measured: how incomes are aggregated over 
time, how the poverty line is set, and what poverty index is chosen (see, e.g., 
Hulme and McKay (2005); Jalan and Ravallion (1998); Duclos et al. (2006a)). 
Both the spells and the components approaches usually rely on one specific poverty line
and one specific poverty function. Moreover, applications of the components
approach are usually based on some calculation of average income over time
and thus abstract from the exact pattern of the income trajectory. In other words,
three consecutive years of high income followed by three consecutive years of low
income are treated the same as six years in which a year of high income follows a year
of low income and so on.
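
The point can be illustrated with a minimal Python sketch. The income figures, the poverty line and the helper name `longest_poor_spell` are hypothetical and serve only to show that two different trajectories can share the same average income while differing in how long poverty persists.

```python
from statistics import mean

# Hypothetical yearly incomes for two households (illustrative numbers only).
blocked = [200, 200, 200, 50, 50, 50]      # three high years, then three low years
alternating = [200, 50, 200, 50, 200, 50]  # high and low years alternate

POVERTY_LINE = 100  # assumed poverty line, in the same units as income


def longest_poor_spell(incomes, z):
    """Length of the longest run of consecutive periods with income below z."""
    longest = current = 0
    for y in incomes:
        current = current + 1 if y < z else 0
        longest = max(longest, current)
    return longest


# Averaging over time cannot tell the two trajectories apart ...
mean_blocked = mean(blocked)
mean_alternating = mean(alternating)

# ... although the persistence of poverty differs between them.
spell_blocked = longest_poor_spell(blocked, POVERTY_LINE)
spell_alternating = longest_poor_spell(alternating, POVERTY_LINE)
```

Both averages coincide, whereas the longest uninterrupted poverty spell is three periods in the first trajectory and one period in the second.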

To circumvent these problems, we suggest another approach for multiperiod 
poverty measurement based on stochastic dominance tests. This enables us to 
establish poverty orderings that are valid for a wide range of aggregation rules of 
incomes observed over time, a wide range of poverty indices, and a wide range of 
poverty lines. Our approach relies on the literature on multi-dimensional poverty
orderings (Duclos et al., 2006b), in which dimensions refer to various indicators
of individual well-being such as income, education and health.1 Our dimensions
are incomes observed at different points in time. Defining dimensions in this way 
raises some further challenges, which we discuss below. We develop our approach 
first for the case where incomes are observed over two periods and then extend it 


1See also Duclos et al. (2006c) and the seminal papers by Bourguignon and Chakravarty
(2002, 2003).
to the case where incomes are observed over n periods. We illustrate this approach
using longitudinal data for Indonesia and Peru. Note that we do not address the 
issue of income uncertainty and disutility due to income volatility. 

Among the papers dealing with multiperiod poverty, Hoy and Zheng
(2007), Foster (2007) and Bossert et al. (2008) are probably the most closely related to ours.

Hoy and Zheng (2007) suggested a lifetime poverty measure derived from an 
axiomatic approach. If we computed the poverty measure we suggest for a lifetime
period instead of for sub-periods of total lifetime, we would be able to derive similar
results. However, we do not explore the implications of various axioms one may
wish to impose on such a lifetime poverty measure. Moreover, whereas Hoy and
Zheng’s approach is designed to compare lifetime poverty across different groups
of individuals, our approach is intended to do both: to compare multiperiod
poverty across different sub-periods of total lifetime for a given group of individuals
and to compare multiperiod poverty for a given sub-period across different
groups of individuals.

In the spirit of Hoy and Zheng (2007), Bossert et al. (2008) have also con- 
structed a lifetime poverty measure. Their approach differs in the properties that 
are deemed relevant. Bossert et al. (2008) model explicitly the persistence of 
poverty over time by relaxing the notion of path independence considered by Hoy 
and Zheng (2007). Their index regards the negative effects of being in poverty 
as cumulative, in the sense that a two-period poverty spell is worse than two one- 
period spells interrupted by a period out of poverty. 

Foster (2007) suggested a new family of chronic poverty measures based on 
the well-known Foster-Greer-Thorbecke measures (1984). Foster (2007) identi- 
fies the chronically poor using two cutoffs: a standard poverty line, which identi- 
fies the time periods during which a person is poor, and a duration cutoff, which 
is the minimum percentage of time a person must be in poverty in order to be 
chronically poor. 
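
As an illustration of the dual-cutoff identification step only (not of Foster's full family of measures), a minimal Python sketch follows. The function names, the example panel, the poverty line `z` and the duration cutoff `tau` are hypothetical choices made for this example.

```python
def is_chronically_poor(incomes, z, tau):
    """Dual-cutoff identification: a person counts as chronically poor if the
    share of periods with income below the poverty line z is at least tau."""
    poor_periods = sum(1 for y in incomes if y < z)
    return poor_periods / len(incomes) >= tau


def chronic_headcount(panel, z, tau):
    """Share of persons in the panel identified as chronically poor."""
    flags = [is_chronically_poor(incomes, z, tau) for incomes in panel]
    return sum(flags) / len(flags)


# Hypothetical panel: three persons observed over four periods.
panel = [
    [50, 60, 120, 40],    # poor in 3 of 4 periods
    [150, 90, 130, 140],  # poor in 1 of 4 periods
    [80, 70, 60, 90],     # poor in all 4 periods
]

flags = [is_chronically_poor(incomes, z=100, tau=0.75) for incomes in panel]
share = chronic_headcount(panel, z=100, tau=0.75)
```

With a duration cutoff of 75 percent, the first and third persons are identified as chronically poor and the second is not, so the results depend on both cutoffs chosen.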

The remainder of our paper is organized as follows. In Section 2 we present 
our methodology. In Section 3 we implement our methodology empirically and 
analyze multiperiod poverty in Indonesia and Peru. In Section 4 we discuss our 
results and conclude. 


1.2 Methodology 


1.2.1 Stochastic dominance in a one-period welfare measure 


We assume that individual well-being, $\lambda$, is a function of $y$, a well-being indicator,
for instance income received in period $t$. Let $y$ be defined over the interval $[0, \infty]$,
where the set of distributions of well-being indicators is $\Psi := \{F : [0, \infty] \to [0, 1]\}$.
We assume a non-decreasing well-being function without imposing anything
concerning the exact contribution of $y$ to well-being:

$$\lambda(y), \quad \text{where} \quad \frac{\partial \lambda(y)}{\partial y} > 0. \tag{1.1}$$

An individual is assumed to be poor if well-being $\lambda(y)$ is below a poverty
frontier, $\lambda(z)$, where $z$ is the poverty line belonging to the well-being indicator.
The poverty set can then be defined as:

$$\Lambda(\lambda) = \{\, y \mid \lambda(y) < \lambda(z) \,\}, \tag{1.2}$$

with $\lambda(z) = 0$.

In what follows we consider, following Atkinson (1987), all additively separable
poverty measures $P$ that are non-decreasing in $\lambda(y)$ and anonymous. We
denote this set of poverty measures $\Xi$. Our poverty measure can be computed by:

$$P(F; \lambda) = \int_{\Lambda(\lambda)} p(y; \lambda)\, dF(y). \tag{1.3}$$

If well-being is only measured along one dimension, e.g. the one-period case,
equation 1.3 can be rewritten as:

$$P(F; z) = \int_{0}^{z} p(y, z)\, dF(y). \tag{1.4}$$


Our set of poverty measures, =, includes, for instance, the Watts measure of 
poverty (Watts, 1968), where p(y, 2) = (Inz — Шу), and all poverty measures within 
the Foster-Greer-Thorbecke family, Po, (Foster et al., 1984) with œ > 0 (Foster and 
Shorrocks, 1988b,c), where p(y,z) = (1 — y/z)*.? 
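The two examples of $p(y,z)$ named above can be evaluated directly on a sample. The following is a minimal sketch, not the authors' code: the function names and the four incomes are hypothetical, and the non-poor are assumed to contribute zero to each index.

```python
import math

def fgt(incomes, z, alpha):
    """FGT index P_alpha: mean of (1 - y/z)**alpha over the poor, zero for the non-poor."""
    total = 0.0
    for y in incomes:
        if y < z:  # only individuals below the poverty line contribute
            total += (1.0 - y / z) ** alpha
    return total / len(incomes)

def watts(incomes, z):
    """Watts index: mean of ln z - ln y over the poor (requires y > 0)."""
    return sum(math.log(z) - math.log(y) for y in incomes if y < z) / len(incomes)

# Hypothetical daily incomes in PPP US$ and a hypothetical poverty line.
incomes = [0.5, 1.0, 1.5, 3.0]
z = 2.0

headcount = fgt(incomes, z, 0)   # alpha = 0: share of poor individuals
poverty_gap = fgt(incomes, z, 1) # alpha = 1: average normalized shortfall
severity = fgt(incomes, z, 2)    # alpha = 2: squared shortfalls
print(headcount, poverty_gap, severity, watts(incomes, z))
```

Each index gives a single number for one poverty line; the dominance conditions that follow are meant to avoid committing to any one of them.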

Tests of stochastic dominance are today widely used to establish poverty
orderings $D$ that are robust for a broad class of poverty measures, $P(F;z)$, and a
large range of poverty lines, $z \in [0, z^{\max}]$.

Given two distributions $F \in \Psi$ and $G \in \Psi$, the first order stochastic dominance
condition (FSD), $D_1$, states:

$$F D_1 G \;\; \forall P \in \Xi_1,\ z \in [0, z^{\max}] \iff F(z) - G(z) < 0 \;\; \forall z \in [0, z^{\max}], \tag{1.5}$$

where $F D_1 G$ means that $F$ has unambiguously less poverty than $G$ with respect to
all poverty indices belonging to the class $\Xi_1$ and all poverty lines within the range
$[0, z^{\max}]$.
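As a hedged illustration of condition 1.5, the sketch below compares two empirical distribution functions on a grid of poverty lines. The samples, the grid step and the function names are assumptions made for the example; no statistical test is applied here, only point estimates are compared.

```python
def ecdf(sample, z):
    """Share of observations with income at or below z."""
    return sum(1 for y in sample if y <= z) / len(sample)

def fsd_poverty_dominates(f_sample, g_sample, z_max, step=0.25):
    """True if F(z) - G(z) < 0 at every grid point in (0, z_max]."""
    n_points = int(round(z_max / step))
    grid = [step * k for k in range(1, n_points + 1)]
    return all(ecdf(f_sample, z) - ecdf(g_sample, z) < 0 for z in grid)

# Hypothetical incomes: F lies to the right of G everywhere on the grid.
f = [1.2, 1.8, 2.6, 3.4, 4.1]
g = [0.2, 0.9, 1.6, 2.2, 3.0]

f_over_g = fsd_poverty_dominates(f, g, z_max=3.0)
g_over_f = fsd_poverty_dominates(g, f, z_max=3.0)
print(f_over_g, g_over_f)
```

Because the inequality is strict, the grid has to start at a poverty line where the distributions already differ; a grid point below both samples would give a difference of zero and the check would fail.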


² The Foster-Greer-Thorbecke poverty measure has the formula
$P_\alpha = \frac{1}{N} \sum_{i=1}^{N} (1 - y_i/z)^{\alpha}\, \mathbf{1}(y_i < z)$,
where $N$ is the total number of individuals $i = 1, \ldots, N$. The parameter $\alpha \ge 0$ is a poverty aversion
parameter: $\alpha = 0$ yields the poverty headcount index, $\alpha = 1$ the poverty gap index, and $\alpha = 2$
the poverty severity index (Foster et al., 1984).

For FSD orderings it is sufficient to compare the distribution function of the
well-being indicator in period 1, $F(y_1)$, with its analog in period 2, $G(y_2)$. The
distribution function can also be called the dominance curve. If first order stochastic
dominance does not hold, higher-order stochastic dominance tests can be applied
to generate robust poverty orderings. Higher-order dominance requires adding
further assumptions on how the function $p(y,z)$ evolves with $y$. For instance,
second order stochastic dominance (SSD) requires specifying $p(y,z)$ in a way that
$P$ satisfies the Pigou-Dalton transfer principle (see e.g. Foster and Jin (1996)). The
Pigou-Dalton transfer principle states that a transfer of income from a richer to a
poorer person will not increase poverty as long as that transfer does not reverse
the ranking of the two. In this case, the areas under the distribution functions can
be compared to generate poverty orderings. If we denote the set of all Daltonian
poverty measures $\Xi_2$, then SSD, $D_2$, states:


$$F D_2 G \;\; \forall P \in \Xi_2,\ z \in [0, z^{\max}] \iff \int_0^z F(y)\,dy - \int_0^z G(y)\,dy < 0 \;\; \forall z \in [0, z^{\max}], \tag{1.6}$$

where $F D_2 G$ means that $F$ has unambiguously less poverty than $G$ with
respect to all poverty indices belonging to the class $\Xi_2$ and all poverty lines
within the range $[0, z^{\max}]$. For instance, within the FGT poverty measure family
the poverty gap ($\alpha = 1$) satisfies the Pigou-Dalton transfer principle; the poverty
headcount ($\alpha = 0$) does not.
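The integrals in condition 1.6 are easy to compute for an empirical distribution, because for non-negative incomes the integral of the empirical distribution function from 0 to z equals the sample mean of max(z - y, 0). The sketch below uses that identity; the samples and names are hypothetical, and the example is chosen so that first order dominance fails while second order dominance holds on the grid.

```python
def ecdf(sample, z):
    """Share of observations with income at or below z."""
    return sum(1 for y in sample if y <= z) / len(sample)

def integrated_cdf(sample, z):
    """Integral of the empirical CDF from 0 to z: mean of max(z - y, 0) for y >= 0."""
    return sum(max(z - y, 0.0) for y in sample) / len(sample)

def ssd_poverty_dominates(f_sample, g_sample, grid):
    """True if the integrated CDF of F lies strictly below that of G on the grid."""
    return all(integrated_cdf(f_sample, z) - integrated_cdf(g_sample, z) < 0
               for z in grid)

# Hypothetical incomes and poverty lines.
f = [1.0, 2.0, 2.0, 4.5]
g = [0.5, 1.5, 2.5, 4.0]
grid = [1.0, 1.5, 2.0, 2.5, 3.0]

fsd_holds = all(ecdf(f, z) - ecdf(g, z) < 0 for z in grid)
ssd_holds = ssd_poverty_dominates(f, g, grid)
print(fsd_holds, ssd_holds)
```

At z = 2 the headcount is higher under F than under G, so no first order ranking exists, yet the cumulated shortfalls are smaller under F at every grid line.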

If second order dominance also does not hold, it is possible to integrate the
distribution function again and to test for third order dominance. This would of
course further limit the set of applicable poverty measures by imposing even more
restrictive axioms. Therefore, in the theoretical part of our paper, we restrict our
analysis to FSD and SSD. In the empirical part we consider only FSD.

Note also that we do not consider weak stochastic dominance, because
statistically it is impossible to distinguish weak and strong stochastic dominance.³

It is widely acknowledged that the concept of poverty dominance is useful 
because it circumvents the problem of choosing one particular poverty measure 
and one specific poverty line. In the following, we extend the concept, first to 
two-period welfare measures and then to n-period welfare measures. 


1.2.2 Stochastic dominance in a two-period welfare measure 


To take into account the dynamic aspects of poverty, we now extend the one-period
well-being function to a two-period well-being function, where the arguments are
$(y_1, y_2)$, e.g. income received in periods 1 and 2. The well-being function can then
be written as:

$$\lambda(y_1, y_2), \quad \text{where} \quad \frac{\partial \lambda(y_1, y_2)}{\partial y_1} > 0, \quad \frac{\partial \lambda(y_1, y_2)}{\partial y_2} > 0. \tag{1.7}$$

³ Weak stochastic dominance requires $F(z) - G(z) \le 0$ for all poverty measures. Thus strict
stochastic dominance, as defined in equation 1.5, implies weak stochastic dominance.

Hence, we impose the condition that the well-being function, $\lambda$, is differentiable
with respect to the welfare measure in $t = 1$ and $t = 2$ and that income in
both periods contributes positively to individual well-being. Yet, as before, we
impose nothing regarding the precise value of the contribution of each year to
individual well-being.

We define an individual to be poor if his or her overall well-being $\lambda(y_1, y_2)$ is
below the unknown poverty frontier. In the two-period case, the poverty frontier
is not a single point, $z$, but a locus of points. We define this locus as $\lambda(y_1, y_2) = 0$.
The overall set of poor people is defined as:

$$\Lambda(\lambda) = \{\, (y_1, y_2) \mid \lambda(y_1, y_2) \le 0 \,\}. \tag{1.8}$$


Depending on the specific definition of the locus of the poverty frontier,
multiperiod poverty comparisons can be performed according to the ‘intersection’ and
the ‘union’ poverty definition (Duclos et al., 2006b). Intersection poverty means in
our case that someone is considered poor if well-being is below the poverty
threshold in both periods. The concept of ‘intersection’ multiperiod poverty is therefore
closely related to the concept of chronic poverty (see e.g. Hulme and Shepherd
(2003)). Intersection poverty is represented in figure 1.1 by the cross-hatched
area under the function $\lambda_1(y_1, y_2)$ (dashed line). Union poverty means that
someone is considered poor if well-being is below the poverty threshold in at least one of the
two periods. This is represented in figure 1.1 by the entire shaded area under the
function $\lambda_2(y_1, y_2)$ (dotted line). In the empirical part of our paper we emphasize
the parallels with the concept of chronic poverty and thus focus on intersection
poverty.

As in the one-period case, we consider all additively separable, non-decreasing
and anonymous poverty measures $P$. However, we add a further restriction. We
require $y_1$ and $y_2$ to be substitutes in $\lambda(y_1, y_2)$.⁴ This assumption implies that
an increase of the well-being indicator in one period increases well-being more
the lower the well-being indicator in the other period. Hence, our concept of
multiperiod poverty accounts for the correlation between individuals’ outcomes
across both periods. We denote this set of poverty measures $\Xi_{1,1}$. Transferring
equation 1.4 to the two-period case, the poverty measure reads:


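$$P(F;\lambda) = \iint_{\Lambda(\lambda(z_1, z_2))} p(\lambda(y_1, y_2), \lambda(z_1, z_2))\, dF(y_1, y_2). \tag{1.9}$$

⁴ $\lambda(y_1, y_2): \mathbb{R}^2 \to \mathbb{R} \;\big|\; \frac{\partial^2 \lambda(y_1, y_2)}{\partial y_1\, \partial y_2} < 0$.

[Figure 1.1: Test Domain for Dynamic Poverty Comparisons. The bisector marks $z_1 = z_2$.]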


Equation 1.9 holds for multiperiod poverty comparisons according to the
intersection as well as according to the union definition of poverty, depending on the
locus of $\Lambda(\lambda)$. If we focus on intersection poverty, as we will do in the empirical
part, equation 1.9 could be rewritten as:

$$P(F; z_1, z_2) = \int_0^{z_1}\!\!\int_0^{z_2} p(y_1, y_2, z_1, z_2)\, dF(y_1, y_2). \tag{1.10}$$

Obviously, as for usual period-by-period poverty orderings, it is desirable that
poverty orderings over multiple time spans, $T_j$, are robust to a large set of poverty
lines $z \in Z$. This can be ensured by simply transferring the concept of stochastic
dominance for univariate welfare distributions to the case of bivariate welfare
distributions. A comparison of two time spans is denoted in what follows as
$T_a = [t_{1a}; t_{2a}]$ vs. $T_b = [t_{1b}; t_{2b}]$, where $t$ now has an index for the period within
each time span, year 1 or year 2, and an index for the time span, time span $a$ or
time span $b$.

Furthermore, poverty orderings in the bivariate case, i.e. across time spans,
should be robust to a broad range of procedures to aggregate the observed
period-specific well-being indicators over the two periods constituting a time span. Thus,
the weight given to each single period should not matter, i.e. whether we weigh
each period equally or whether we give to one period a higher weight than to the
other. A reason for doing the latter might be to account for time preference, i.e.
one weighs income today more than income tomorrow. Hence, we require that our
ordering is robust to the magnitude and even the sign of the time discount rate.⁵


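Given two distributions $F(y_{1a}, y_{2a}) \in \Psi$ and $G(y_{1b}, y_{2b}) \in \Psi$, the first order
stochastic dominance condition, $D_{1,1}$, states:

$$\begin{aligned}
F D_{1,1} G \;\; &\forall P \in \Xi_{1,1},\ z_1 \in [0, z_1^{\max}],\ z_2 \in [0, z_2^{\max}] \\
\iff\ & F(z_1, z_2) - G(z_1, z_2) < 0 \;\; \forall z_1 \in [0, z_1^{\max}],\ z_2 \in [0, z_2^{\max}],
\end{aligned} \tag{1.11}$$

where $F D_{1,1} G$ means that multiperiod poverty is lower over time span $T_a$ than
over time span $T_b$ with respect to all poverty indices belonging to the class $\Xi_{1,1}$
and all poverty lines within the ranges $[0, z_1^{\max}]$ and $[0, z_2^{\max}]$.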
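A minimal sketch of condition 1.11 for the intersection definition: the bivariate distribution function is estimated as the share of individuals below both period-specific lines, and the difference between the two surfaces is evaluated on a rectangular grid. The two panels, the grid and the function names are hypothetical, and only point estimates are compared.

```python
def joint_cdf(panel, z1, z2):
    """Share of individuals with y1 <= z1 and y2 <= z2 (intersection poverty)."""
    return sum(1 for y1, y2 in panel if y1 <= z1 and y2 <= z2) / len(panel)

def surface_difference(panel_a, panel_b, lines1, lines2):
    """Grid of F(z1, z2) - G(z1, z2); rows are indexed by z1, columns by z2."""
    return [[joint_cdf(panel_a, z1, z2) - joint_cdf(panel_b, z1, z2)
             for z2 in lines2]
            for z1 in lines1]

def dominates(diff):
    """True if the first surface lies strictly below the second at every grid point."""
    return all(d < 0 for row in diff for d in row)

# Hypothetical two-period incomes (y1, y2) for time spans a and b.
panel_a = [(1.5, 1.2), (2.5, 3.0), (3.5, 2.8), (4.0, 4.5)]
panel_b = [(0.8, 0.9), (1.6, 1.4), (2.2, 3.5), (3.0, 2.5)]
lines = [1.0, 2.0, 3.0]

diff = surface_difference(panel_a, panel_b, lines, lines)
a_dominates = dominates(diff)
print(diff, a_dominates)
```

Off-diagonal cells of the grid correspond to different poverty lines in the two periods, which is how the text introduces unequal period weights.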

As in the one-period case, tests of higher order dominance could be equally 
well established by imposing further assumptions regarding the effect of y on 
p(y1,y2,21,22). For instance, holding constant the distribution in period 2, we 
could impose that a transfer from a richer to a poorer person in period 1 reduces 
poverty. Symmetrically, we would then impose the same transfer sensitivity on 
period 2. 


As mentioned above, we also require our concept to be robust to a broad range
of procedures for aggregating the observed period-specific well-being indicators
over the two periods constituting a time span. The simplest way to deviate from
an aggregation where each period receives the same weight is to vary the poverty
lines within time spans, since this varies the income necessary to be beyond the
period-specific poverty frontier in each period. If we choose $z_1 \ne z_2$ s.t. $z_{1a} = z_{1b}$
and $z_{2a} = z_{2b}$, i.e., give a different weight to the first and second period of each time
span, the test domain for intersection poverty dominance is a rectangle,
where $y_1 < z_1$ and $y_2 < z_2$. This is illustrated by the dashed line in Figure 1.1. In
what follows, the aggregation procedure is incorporated through the definition of
the poverty lines.


In our methodology, and in contrast to ‘one-period stochastic dominance’,
$F(y_1, y_2)$ now refers to a bivariate distribution. Hence, the test of stochastic
dominance does not imply comparing two curves, as with one-period well-being
measures, but two surfaces, where each surface is characterized by its two dimensions
(the well-being measure in the first and second period) and the cumulative density
at each point of that surface. Rewriting equation 1.10 shows that the dominance
surface is the product of the two unidimensional curves plus the covariance in the
poverty indices in the two dimensions (Duclos et al., 2006b):

$$P(F; z_1, z_2) = \int_0^{z_1} p(y_1, z_1)\, dF(y_1) \int_0^{z_2} p(y_2, z_2)\, dF(y_2) + \operatorname{cov}\big[p(y_1, z_1),\, p(y_2, z_2)\big]. \tag{1.12}$$

⁵ In a similar way one could account for uncertainty regarding the right way to deflate incomes
from one period to the next.
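The decomposition in equation 1.12 can be checked numerically. The sketch below assumes, for illustration only, that the two-period poverty function is the product of the period-specific FGT terms; with that assumption the sample mean of the product equals the product of the means plus the (population) covariance. The two panels are hypothetical and differ only in how incomes are correlated across periods.

```python
def shortfall(y, z, alpha):
    """Period-specific FGT term: (1 - y/z)**alpha for the poor, zero otherwise."""
    return (1.0 - y / z) ** alpha if y < z else 0.0

def decompose(panel, z1, z2, alpha=0):
    """Return (joint index, product of marginal indices, covariance term)."""
    n = len(panel)
    p1 = [shortfall(y1, z1, alpha) for y1, _ in panel]
    p2 = [shortfall(y2, z2, alpha) for _, y2 in panel]
    mean1, mean2 = sum(p1) / n, sum(p2) / n
    joint = sum(a * b for a, b in zip(p1, p2)) / n
    cov = sum((a - mean1) * (b - mean2) for a, b in zip(p1, p2)) / n
    return joint, mean1 * mean2, cov

# Same marginal distributions in both periods, opposite correlation.
persistent = [(1, 1), (1, 1), (5, 5), (5, 5)]  # the same people are poor twice
mobile = [(1, 5), (1, 5), (5, 1), (5, 1)]      # everyone is poor exactly once

joint_p, product_p, cov_p = decompose(persistent, 2.0, 2.0)
joint_m, product_m, cov_m = decompose(mobile, 2.0, 2.0)
print(joint_p, product_p, cov_p)
print(joint_m, product_m, cov_m)
```

The product term is identical in both panels, so the whole difference in intersection poverty comes from the covariance term, which is positive when poverty is persistent and negative when individuals swap positions.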

The higher the correlation of individuals’ incomes the ‘higher’ the dominance
surface. Our multiperiod poverty index therefore implicitly judges a situation in
which one individual is always poor and one always rich worse, ceteris paribus,
than a situation where two individuals are poor in one period and rich in the other
period. A further comment regarding the robustness to the aggregation procedure
is in order. In fact, the way we deal with this problem implies that there is
one special situation in which robustness to the aggregation procedure cannot be
tested. This arises when the time spans under consideration overlap, i.e., when the
second period of the first time span simultaneously represents the first period of
the second time span. An example is a comparison of poverty over the time span
1980-1990 with poverty over the time span 1990-2000, i.e. $y_{2a} = y_{1b}$. In this
case, the same weight has to be assigned to each period.

In this special case the dominance criterion simplifies to:

$$F D_{1,1} G \;\; \forall P \in \Xi_{1,1},\ z \in [0, z^{\max}] \iff F(z, z) - G(z, z) < 0 \;\; \forall z \in [0, z^{\max}], \tag{1.13}$$

where $F D_{1,1} G$ means that multiperiod poverty is lower over time span $T_a$ than over
time span $T_b$ with respect to all poverty indices belonging to the class $\Xi_{1,1}$ and
all poverty lines within the range $[0, z^{\max}]$. Note that we now only test dominance
between the two surfaces along an expansion path of $z$, where $y_1 < z$ and $y_2 < z$
(see the bisector line in figure 1.1).
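Restricting the test to the bisector, as in condition 1.13, means evaluating both surfaces only at points with the same poverty line in both periods. A hedged sketch with hypothetical panels follows; it reports raw differences rather than significance tests and is built so that the sign of the difference changes as the poverty line rises.

```python
def joint_cdf(panel, z1, z2):
    """Share of individuals with y1 <= z1 and y2 <= z2."""
    return sum(1 for y1, y2 in panel if y1 <= z1 and y2 <= z2) / len(panel)

def bisector_differences(panel_a, panel_b, lines):
    """F(z, z) - G(z, z) for each common poverty line z."""
    return [joint_cdf(panel_a, z, z) - joint_cdf(panel_b, z, z) for z in lines]

# Hypothetical two-period incomes for two overlapping time spans.
panel_a = [(0.8, 0.9), (2.5, 2.8), (4.0, 3.5), (4.5, 4.8)]
panel_b = [(1.5, 1.8), (1.9, 1.6), (2.6, 2.9), (4.2, 4.4)]
lines = [1.0, 2.0, 3.0]

diffs = bisector_differences(panel_a, panel_b, lines)
dominance_holds = all(d < 0 for d in diffs)
print(diffs, dominance_holds)
```

In this constructed example span a has more intersection poverty at the lowest line and less at the higher lines, so no ordering is robust to the choice of poverty line.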

The ‘overlap’ problem can obviously not occur with comparisons over space,
say, if poverty over the time span 1990-2000 in country a is compared to poverty
over the time span 1990-2000 in country b. In this case nothing prevents us from
choosing $z_1 \ne z_2$ s.t. $z_{1a} = z_{1b}$ and $z_{2a} = z_{2b}$, i.e. giving a different weight to the
first and second period within each time span in each country.


1.2.3 Stochastic dominance in an n-period welfare measure


Extending our methodology to the n-period case is straightforward. Our
well-being measure becomes $\lambda(y_1, y_2, \ldots, y_n)$. The well-being measure is differentiable
with respect to each single period income $y_i$, where $\partial \lambda(y_1, y_2, \ldots, y_n)/\partial y_i > 0$.
The poverty locus becomes an n-dimensional space.

Given two distributions $F(y_{1a}, y_{2a}, \ldots, y_{na}) \in \Psi$ and $G(y_{1b}, y_{2b}, \ldots, y_{nb}) \in \Psi$,
the first order stochastic dominance condition, $D_{1,1,\ldots,1}$, states:

$$\begin{aligned}
F D_{1,1,\ldots,1} G \;\; &\forall P \in \Xi_{1,1,\ldots,1},\ z_1 \in [0, z_1^{\max}],\ z_2 \in [0, z_2^{\max}],\ \ldots,\ z_n \in [0, z_n^{\max}] \\
\iff\ & F(z_1, z_2, \ldots, z_n) - G(z_1, z_2, \ldots, z_n) < 0 \\
&\forall z_1 \in [0, z_1^{\max}],\ z_2 \in [0, z_2^{\max}],\ \ldots,\ z_n \in [0, z_n^{\max}],
\end{aligned} \tag{1.14}$$

where $F D_{1,1,\ldots,1} G$ means that multiperiod poverty is lower over time span $T_a$ than
over time span $T_b$ with respect to all poverty indices belonging to the class $\Xi_{1,1,\ldots,1}$
and all poverty lines within the ranges $z_1 \in [0, z_1^{\max}]$, $z_2 \in [0, z_2^{\max}]$, ..., $z_n \in [0, z_n^{\max}]$.

Of course the n-dimensional case allows us again to be robust with respect to
the aggregation procedure by giving a different weight to the n periods within each
time span, i.e. by choosing $z_1 \ne z_2, \ldots, z_{n-1} \ne z_n$ s.t. $z_{1a} = z_{1b}$, $z_{2a} = z_{2b}$, ..., $z_{na} = z_{nb}$.
$F(y_1, y_2, \ldots, y_n)$ now refers to an n-variate distribution and, hence, the test
of stochastic dominance now implies comparing two hypersurfaces, where each
hypersurface is characterized by its n dimensions (the welfare measure observed
over the n periods) and the cumulative density at each point of that hypersurface.

An additional issue that arises in the n period case is how exactly the two 
time spans are compared. Theoretically, one can compare time spans built using 
different sets of periods as long as each time span has the same number of periods 
and as long as the beginning and the end of the first time span each precede the 
beginning and the end of the second time span respectively. One can then even 
test for dominance over all these comparisons. Below we illustrate such a case 
using time spans of a maximum length of four years. 


1.2.4 Relative poverty comparison 


So far we have proposed the methodology of multiperiod poverty comparison for 
the concept of absolute poverty. Absolute poverty measures deal with income 
mobility; they consider an absolute poverty frontier and keep track of people who 
either stay below or cross this fixed frontier. However, the methodology of mul- 
tiperiod poverty comparisons is equally well applicable to the concept of relative 
poverty. Relative poverty measures take into account social mobility; while still 
keeping track of people who either stay below or cross the poverty line, this fron- 
tier becomes endogenous, for example, expressed as a ratio of the median income. 
Embedding our concept of multiperiod poverty in the concept of relative poverty 
has some common features with the concept of 'social exclusion' as formulated 
by Bossert et al. (2007). 
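To make the endogenous poverty frontier concrete, the sketch below sets each period's poverty line to a fixed fraction of that period's median income and then counts individuals below the line in both periods. The fraction of 0.5, the panel and the function names are illustrative assumptions; the text only states that the line may be expressed as a ratio of the median income.

```python
from statistics import median

def relative_lines(panel, fraction=0.5):
    """Period-specific poverty lines as a fraction of each period's median income."""
    z1 = fraction * median([y1 for y1, _ in panel])
    z2 = fraction * median([y2 for _, y2 in panel])
    return z1, z2

def intersection_share(panel, z1, z2):
    """Share of individuals strictly below the poverty line in both periods."""
    return sum(1 for y1, y2 in panel if y1 < z1 and y2 < z2) / len(panel)

# Hypothetical two-period incomes.
panel = [(1, 2), (2, 2), (4, 6), (6, 8), (10, 12)]

z1, z2 = relative_lines(panel)
share = intersection_share(panel, z1, z2)
print(z1, z2, share)
```

Because the lines move with the median, an individual can leave relative poverty either through own income growth or through a fall in the median, which is the sense in which the frontier is endogenous.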


1.2.5 Estimation and inference 


To establish first order stochastic dominance empirically, it is sufficient, as shown
by Duclos et al. (2006b), to calculate the differences of $\hat{F}(y_{1a}, y_{2a}, \ldots, y_{na})$ and
$\hat{G}(y_{1b}, y_{2b}, \ldots, y_{nb})$ on a sufficiently narrow grid of test points and to test the
statistical significance of these differences based on Student t-tests (where ^ refers to
estimated values). The relevant test domain changes based on the definition of
poverty (union or intersection).


1.2.6 Bounds to multidimensional dominance 


When applying the methodology presented above, one needs to define a maximum
poverty set $\Lambda^*(z_1, z_2, \ldots, z_n \in Z)$. Obviously, defining that frontier is always
arbitrary. We again follow Duclos et al. (2006b) and estimate that frontier directly
from our sample as the maximum $\Lambda^*$ for which multiperiod poverty dominance
holds. Then we can locate within $\Lambda^*$ all possible poverty frontiers for which there
is necessarily more poverty in time span a than in time span b. We then can judge
on a case-by-case basis whether these critical sets and frontiers are wide enough
to justify the conclusion on poverty dominance.
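A much simplified version of estimating such a bound scans the bisector from the lowest poverty line upwards and stops at the first line where the ordering breaks down. The sketch uses point estimates only, whereas the procedure described above relies on statistical tests over the full test domain; panels, grid and names are hypothetical.

```python
def joint_cdf(panel, z1, z2):
    """Share of individuals with y1 <= z1 and y2 <= z2."""
    return sum(1 for y1, y2 in panel if y1 <= z1 and y2 <= z2) / len(panel)

def max_dominance_line(panel_a, panel_b, lines):
    """Largest z* such that F(z, z) - G(z, z) < 0 at every grid line up to z*.

    Returns None if the inequality already fails at the lowest line.
    """
    z_star = None
    for z in sorted(lines):
        if joint_cdf(panel_a, z, z) - joint_cdf(panel_b, z, z) < 0:
            z_star = z
        else:
            break
    return z_star

# Hypothetical two-period incomes for time spans a and b.
panel_a = [(1.5, 1.8), (2.4, 2.2), (2.8, 2.9), (4.2, 4.4)]
panel_b = [(0.8, 0.9), (1.6, 1.4), (3.5, 3.2), (4.0, 3.8)]

z_star = max_dominance_line(panel_a, panel_b, [1.0, 2.0, 3.0, 4.0, 5.0])
print(z_star)
```

Whether the estimated bound is wide enough to support a dominance conclusion remains a judgment call, as the text notes.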


1.3 Empirical illustration 


1.3.1 Data 


To illustrate the methodology presented above, we use longitudinal data for In- 
donesia and Peru. 

For Indonesia, we use all three existing waves of the Indonesian Family Life 
Survey conducted by RAND, the University of California Los Angeles, the Uni- 
versity of Indonesia's Demographic Institute and the Center for Population and 
Policy Studies of the University of Gadjah Mada in 1993 (IFLS1), 1997 (IFLS2) 
and 2000 (IFLS3). The IFLS is representative of 83% of the Indonesian popula- 
tion living in 13 of the (at that time) nation's 26 provinces. The IFLS is judged 
as having a very high quality, among other things, because individuals who have 
moved are tracked to their new location and, where possible, interviewed there 
(for details see Strauss et al. (2004)). Using the three waves, we built two panels,
one from 1993 to 1997 and one from 1997 to 2000, each comprising roughly
32,000 individuals living in 7,000 households. We use real household expenditure 
per capita as the welfare measure, but refer to it as income in the following. Ex- 
penditure is expressed in 1993 prices and adjusted by regional price deflators to 
the Jakarta price level. 

For Peru we use six waves (1997-2002) of the yearly Peruvian Encuesta Nacional
de Hogares (ENAHO) conducted by the Instituto Nacional de Estadística e Informática.
The ENAHO is representative of the three rural and four urban areas
of Peru. The 'panel-households' are only a sub-sample of all households interviewed.
Each year, some households drop out of the panel and others are added
(rotating panel). We construct several year-to-year panels each containing, with
a few exceptions, more than 5,000 individuals living in more than 1,000 households.
We again use real household expenditure per capita as the income measure.
Expenditure is expressed in 2002 prices and adjusted by regional price deflators
to the Lima price level.

To make income comparable between Indonesia and Peru we convert local 
currencies to international USD. Purchasing Power Parities (PPP) were taken from 
the Penn World Table 6.1 (see Heston et al. (2002)). 


1.3.2 Robust multiperiod poverty comparisons for the two-period case


In the following we first show empirically how to test for robustness to poverty 
lines. In this case the arbitrary poverty line is assumed to be constant across the 
n periods. We then show how to test for robustness to the aggregation procedure 
by using different poverty lines across periods. To keep the exposition simple 
and short the empirical illustration will primarily focus on first order stochastic 
dominance tests using the intersection definition of poverty. 


Robustness to poverty lines 


To analyze the robustness to the poverty line we use three waves of the Peruvian 
household panel data and consider the time spans 1998 to 1999 and 1999 to 2000. 
According to equation 1.13, first order stochastic dominance poverty comparisons can be made
by testing for significant differences between the dominance surface of 1998/99 
and the dominance surface of 1999/2000. Testing robustness to the poverty line 
implies testing all points on the bisector between income in period 1 and income 
in period 2. Figure 1.2 shows the dominance surface of the first time span 1998- 
1999. The x and y axes measure income (or more precisely household expenditure 
per capita per day) at the beginning (1998) and the end (1999) of the time span. 
Expenditures are expressed in 2002 US$ PPP equivalents. The third axis measures 
the cumulative share of individuals who are below the points defined in the (x, y) 
domain. 

[Figure 1.2: Poverty in Peru: Dominance Surface of the Time Span 1998/99. Axes: HH income p.c. 1998, HH income p.c. 1999, cumulative distribution. Income: Household income per capita per day in PPP US$. Source: Authors’ calculations based on ENAHO.]

Figure 1.3 shows the difference between the dominance surfaces of the time
spans 1998/99 and 1999/2000. The relevant points can be found on the bisector of
the graph, since we are testing only robustness to the poverty line (i.e., z1 = z2).
The figure shows that for very low incomes, multiperiod poverty was higher in
the first than in the second time span for all poverty indices belonging to the class
Ξ_{1,1}. However, as we increase the poverty line, we find that the cumulative share
of people having had an income below that poverty line increases faster and that
multiperiod poverty becomes higher for the second time span. This is a very
interesting result because it highlights the importance of conducting dominance
tests in this context. It can be seen even more clearly in Table 1.1.

The vertical axis in Table 1.1 shows income at the beginning of the time spans 
and the horizontal axis at the end of the time spans. The value ‘1’ indicates a 
significant positive difference, i.e., 1999/2000 dominates 1998/99. ‘0’ means an 
insignificant difference, while ‘—1’ indicates a significant negative difference, i.e., 
1998/99 dominates 1999/2000. Actually, we should check for poverty dominance 
at every possible point on this bisector, i.e. at every possible poverty line (e.g., 
$1, $1.01, $1.02, etc.). However, to keep the presentation simple and transparent, 
we abstained from such a detailed analysis and report results only at all poverty 
lines that are multiples of $0.5. Again, the table demonstrates the relevance of our 
approach. Relying on the $1 poverty line, one can conclude that ‘chronic’ poverty, 
i.e. individuals who are under the poverty line in both periods constituting a time 
span, would have fallen from the first to the second time span because there were 
more individuals with less than $1 in 1998 and 1999 than in 1999 and 2000. 
However, if we rely on the $2 poverty line, dominance does not hold anymore 
given the insignificant differences between the surfaces. Finally, if we rely on 
the $3 poverty line, one can conclude that chronic poverty has risen from the 
first to the second time span. Thus, any conclusion about poverty orderings relies
heavily on the poverty line chosen. In other words, to state that ‘chronic’ poverty
(as defined here) has changed significantly from one time span to another, one
has first to define an appropriate maximum poverty line and then check whether
poverty dominance holds at every possible poverty line up to this maximum.

[Figure 1.3: Poverty in Peru: Difference in Dominance Surfaces (1998/99 - 1999/2000). Axes: HH income p.c. period 1, HH income p.c. period 2, differences in dominance surfaces. Income: Household income per capita per day in PPP US$. Source: Authors’ calculations based on ENAHO.]

Table 1.1: Poverty in Peru: Difference in Dominance Surfaces (1998/99 - 1999/2000)

Income      Income period 2
period 1     1.0   1.5   2.0   2.5   3.0   3.5   4.0   4.5   5.0
1.0            1
1.5                  0
2.0                        0
2.5                              0
3.0                                   -1
3.5                                         -1
4.0                                               -1
4.5                                                     -1
5.0                                                           -1

Income: Household income per capita per day in PPP US$; 1 indicates that the 1998/99 surface
was significantly above the 1999/2000 surface, -1 indicates the opposite, 0 indicates no significant
difference. Significance level: 5%

Source: Authors’ calculations based on ENAHO


Robustness to Aggregation Procedures 


Robustness to the aggregation procedure seems to be equally important since the
weights attributed to different periods are often arbitrarily chosen. ‘Time discounting’,
for instance, might appear to be the most appropriate weighting scheme for
economists. However, it is empirically very difficult to obtain a reliable and
precise value for consumers’ discount rates. One therefore needs to be sure that
the poverty ordering is robust against alternative weights in a reasonable range.
Variations in the discount rate mean changes in the aggregation procedure across
periods within a time span. Again, as mentioned above, we choose here a very
simple way of attributing different weights to different periods. We simply apply
different poverty lines to period 1 and period 2 within each time span. In other
words, applying a higher poverty line in the second period than in the first period
has the same effect as applying a discount rate to period 2 poverty. As will
be demonstrated now, if the time spans under consideration do not overlap, our
methodology simultaneously ensures robustness to poverty lines and aggregation
procedures. Moreover, it of course also ensures robustness to a wide range of
poverty measures.


We compare the time span 1998/1999 with the time span 2000/2001. In 
contrast to the procedure illustrated above, now one has not only to check for 
significant differences between the two surfaces at the bisector but at all points 
below and above the bisector up to a reasonable maximum poverty line. This 
becomes clear when looking at Figure 1.4 and Table 1.2. Figure 1.4 shows the 
difference between the two dominance surfaces. A robust poverty ordering would 
require that one surface is above the other surface at all points up to a reasonable 
maximum poverty line. This is obviously not the case here. Table 1.2 illustrates 
this further. Given the many ‘0’s in the grid of test points, it is clear that poverty
dominance cannot be established for any reasonable set of poverty lines in any 
aggregation procedure. 


[Figure 1.4: Poverty in Peru: Difference in Dominance Surfaces (1998/99 - 2000/01). Axes: HH income p.c. period 1, HH income p.c. period 2, differences in dominance surfaces. Income: Household income per capita per day in PPP US$. Source: Authors’ calculations based on ENAHO.]

Table 1.2: Poverty in Peru: Difference in Dominance Surfaces (1998/99 - 2000/01)

Income      Income period 2
period 1     1.0   1.5   2.0   2.5   3.0   3.5   4.0   4.5   5.0
1.0            0     0     0     0     0     0     0     0     0
1.5            0     0     0     0     0     0     0     0     0
2.0           -1     0     0     0     0     0     0     0     0
2.5            0     0     0     0     0     0     0     0     0
3.0            0     0     0     0     0     0    -1    -1    -1
3.5            0     0     0     0     0    -1    -1    -1     0
4.0            0    -1    -1     0    -1    -1    -1    -1    -1
4.5            0    -1    -1     0    -1    -1    -1    -1    -1
5.0            0    -1    -1    -1    -1    -1    -1    -1    -1

Income: Household income per capita per day in PPP US$; 1 indicates that the 1998/99 surface
was significantly above the 2000/01 surface, -1 indicates the opposite, 0 indicates no significant
difference. Significance level: 5%

Source: Authors’ calculations based on ENAHO

To underline the economic relevance of our approach, we now show the specific
outcomes of weighting period 1 and period 2 differently. We consider poverty
orderings D which are robust for a broad class of poverty measures, P(F;z;r), and
a large range of poverty lines, z ∈ Z, and discount rates, r ∈ R. Hence, we rely on
a poverty index P that assesses the degree of poverty, given a two-period distribution
F(y1, y2) when the poverty line is z and the discount factor of subsequent
periods to the first period of a given time span is r. Therefore, we state that:


$$\begin{aligned}
F D_{1,1} G \;\; &\forall P \in \Xi_{1,1},\ z \in [0, z^{\max}],\ r \in [0, r^{\max}] \\
\iff\ & F(z, z(1+r)) - G(z, z(1+r)) < 0 \;\; \forall z \in [0, z^{\max}],\ r \in [0, r^{\max}],
\end{aligned} \tag{1.15}$$

where $F D_{1,1} G$ means that multiperiod poverty is lower over time span $T_a$ than over
time span $T_b$ with respect to all poverty indices belonging to the class $\Xi_{1,1}$, all
(time-constant) poverty lines within the range $[0, z^{\max}]$ and any weighting factor
in the range $R$ to discount incomes observed in later periods to the first period
constituting a time span.
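Condition 1.15 replaces the free second-period poverty line by z(1 + r), so the test points form a grid over poverty lines and discount rates. The sketch below evaluates the difference between two estimated surfaces at those points; panels, grids and names are hypothetical, and the comparison uses point estimates without significance tests.

```python
def joint_cdf(panel, z1, z2):
    """Share of individuals with y1 <= z1 and y2 <= z2."""
    return sum(1 for y1, y2 in panel if y1 <= z1 and y2 <= z2) / len(panel)

def discounted_differences(panel_a, panel_b, lines, rates):
    """F(z, z(1+r)) - G(z, z(1+r)) for each poverty line z and discount rate r."""
    result = {}
    for z in lines:
        for r in rates:
            z2 = z * (1 + r)  # second-period line implied by the discount rate
            result[(z, r)] = joint_cdf(panel_a, z, z2) - joint_cdf(panel_b, z, z2)
    return result

# Hypothetical two-period incomes for time spans a and b.
panel_a = [(1.5, 1.2), (2.5, 3.0), (3.5, 2.8), (4.0, 4.5)]
panel_b = [(0.8, 0.9), (1.6, 1.4), (2.2, 3.5), (3.0, 2.5)]

diffs = discounted_differences(panel_a, panel_b,
                               lines=[1.0, 2.0, 3.0],
                               rates=[0.0, 0.05, 0.10])
robust = all(d < 0 for d in diffs.values())
print(robust)
```

A negative rate can be passed in the same way if past poverty is to be weighted more heavily than present poverty.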

To illustrate this methodology, we consider the comparison of the time spans
1998/99 and 2000/2002. The two time spans are of different length, such that
discounting to the present may be important.⁶ The results are shown in Figure 1.5
and Table 1.3. Table 1.3 has two dimensions. The first dimension corresponds
to income, and the second corresponds to the discount rate used. That means
that each cell corresponds to one point of the bisector between income in the
first and second period of each time span, where income in the second period is
discounted by the factor (1+r)^n, where n is the length of the respective time span
measured in years. For instance, the ‘1’ in the sixth column of the first row means
that if incomes of period 2 in each time span are discounted by a factor 1.05 per
year, multiperiod poverty was significantly higher in 1998/99 than in 2000/02. As
before, we check at a grid of test points for significant differences of the bisectors
for a large range of discount rates and poverty lines. Overall, Table 1.3 shows that
in this comparison, 1998/99 vs. 2000/2002, poverty dominance does hold up to
a poverty line of $1.2 and a discount rate of r = 0.05, but not beyond.


Comparisons across socio-economic groups 


Another meaningful application of our proposed concept is to compare multiperiod
poverty across groups, e.g. socioeconomic categories, within a country. Comparing
poverty of employees in the formal private sector with poverty of self-employed
individuals in the informal sector based on multiperiod stochastic dominance
could yield different findings than simple cross-sectional comparisons
or comparisons based on multi-period averages. As before, differing results
may occur depending on the chosen poverty line and the time-discount rate.

This is now illustrated for Indonesia and the time span 1993/1997. We ask 
whether intersection poverty was more severe for self-employed than for private 


6One might also argue that past poverty may be more important than present poverty. Hence, 


it could also be useful to consider negative instead of positive discount rates. 
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Figure 1.5: Poverty in Peru: Difference in Dominance Surfaces (1998/99 - 
2000/02) 


Differences in surfaces 


Discount rate 


HH income p.c. 


Income: Household income per capita per day in PPP US$ 
Source: Authors’ calculations based on ENAHO 


Table 1.3: Poverty in Peru: Difference in Dominance Surfaces (1998/99 - 2000/02)

Income: Household income per capita per day in PPP US$; 1 indicates that the 1998/99 surface
was significantly above the 2000/02 surface, -1 indicates the opposite, 0 indicates no significant
difference. Significance level: 5%; Source: Authors’ calculations based on ENAHO


Table 1.4: Poverty in Indonesia: Differences in dominance surfaces (Self-employed - Private sector)

Income                               Income period 2
period 1   1.25   1.5  1.75   2.0  2.25   2.5  2.75   3.0  3.25   3.5  3.75   4.0

1.25          1     0     0     1     1     1     1     0     0     0     0     0
1.5           1     1     1     1     1     1     1     1     1     1     1     1
1.75          1     1     1     1     1     1     1     1     1     1     1     1
2.0           1     1     1     1     1     1     1     1     1     1     1     1
2.25          1     1     1     1     1     1     1     1     1     1     1     1
2.5           1     1     1     1     1     1     1     1     1     1     1     1
2.75          1     1     1     1     1     1     1     1     1     1     1     1
3.0           1     1     1     1     1     1     1     1     1     0     0     0
3.25          1     1     1     1     1     1     1     0     1     0     0     0
3.5           1     1     1     1     1     1     0     0     0     0     0     0
3.75          1     1     1     1     1     1     1     0     0     0     0     0
4.0           1     1     1     1     1     1     1     0     1     0     0     0

Income: Household income per capita per day in PPP US$; 1 indicates that the 1993/97 surface of the self-employed
was significantly above the 1993/97 surface of the private sector employees, -1 indicates the opposite, 0 indicates no
significant difference. Significance level: 5%

Source: Authors’ calculations based on ENAHO


sector employees, or vice versa, regardless of the chosen poverty line and aggregation
procedure. The results are displayed in Table 1.4. Since there are only a few
private sector employees with an income below $1 per person per day, the grid
starts at the $1.25 poverty line. Ignoring the issue of the aggregation procedure,
the findings demonstrate poverty dominance of private sector employees over the
self-employed up to a maximum poverty line of $3.25: whatever poverty line up to
$3.25 is chosen, one finds more self-employed individuals below the poverty line.
If discounting is introduced, this result is confirmed over almost the entire
grid, the exception being the $1.25 poverty line. If the $1.25 poverty line is
chosen, discounting income in 1997 to its present value in 1993 can render the
differences between the dominance surfaces insignificant, as shown by the two 0s
in the first row of the table. However, for any other poverty line between $1.5
and $3, poverty dominance holds regardless of the applied discount rate.
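The construction of a grid of test points such as Table 1.4 can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure used for the tables above: it takes the dominance surface at (z1, z2) to be the share of individuals with income at or below z1 in the first period and at or below z2 in the second, treats the two groups as independent simple random samples, and uses a normal approximation at the 5% level. The function names and the toy data are hypothetical.

```python
import math


def surface(sample, z1, z2):
    """Share of individuals with y1 <= z1 and y2 <= z2 (assumed surface definition)."""
    return sum(1 for y1, y2 in sample if y1 <= z1 and y2 <= z2) / len(sample)


def compare_point(sample_a, sample_b, z1, z2, crit=1.96):
    """Return 1 if surface A is significantly above B, -1 if below, 0 otherwise."""
    pa = surface(sample_a, z1, z2)
    pb = surface(sample_b, z1, z2)
    # Standard error of a difference in proportions, independent samples (assumption).
    se = math.sqrt(pa * (1 - pa) / len(sample_a) + pb * (1 - pb) / len(sample_b))
    if se == 0:
        return 0  # both shares are 0 or 1: no testable difference
    t = (pa - pb) / se
    if t > crit:
        return 1
    if t < -crit:
        return -1
    return 0


def grid(sample_a, sample_b, lines):
    """Matrix of test outcomes; rows index z1 (period 1), columns index z2 (period 2)."""
    return [[compare_point(sample_a, sample_b, z1, z2) for z2 in lines] for z1 in lines]


# Hypothetical toy data: (income in period 1, income in period 2) per individual.
self_employed = [(1.0, 1.0)] * 60 + [(5.0, 5.0)] * 40
private_sector = [(1.0, 1.0)] * 20 + [(5.0, 5.0)] * 80
outcomes = grid(self_employed, private_sector, [0.5, 2.0, 6.0])
```

Dominance over a chosen domain would then correspond to a block of the matrix that contains only 1s, which is how the tables in this section are read.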


1.3.3 Robust multiperiod relative poverty comparisons for the 
two-period case within and across countries 


We now apply our concept to relative poverty comparisons. To illustrate the idea
of relative poverty, consider a household that has experienced a significant
increase in income from one period to another and thus moved out of poverty from
an absolute perspective. If the income of almost all households in the region has
risen in a similar way, this household might still be poor from a relative
perspective, i.e., the poverty gap to the median did not decline. Accordingly, people are
referred to as ‘chronically poor’ in relative terms if their income, measured as a
ratio of the median income, stays below a given proportion for consecutive years.


To test for differences in relative poverty between two time spans, we standardize
household expenditures by a relative poverty line z, i.e., ỹ = y/z. We choose
z = 50% of median income.⁷ Accordingly, a relative income of 1, for example,
means that the individual’s income is exactly half of the median income.
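The standardization can be made concrete with a short sketch. The function name, the `share` parameter and the income values below are hypothetical and serve only to illustrate ỹ = y/z with z set to 50% of the median.

```python
from statistics import median


def relative_incomes(incomes, share=0.5):
    """Divide each income by the relative poverty line z = share * median income."""
    z = share * median(incomes)
    return [y / z for y in incomes]


# Hypothetical incomes with a median of 4.0, so that z = 2.0.
sample = [1.6, 2.0, 4.0, 6.0, 8.0]
standardized = relative_incomes(sample)
```

In this toy sample an income of 2.0 (half the median) maps to a relative income of 1, and an income of 1.6 (40% of the median) maps to 0.8. Choosing a different share only rescales all relative incomes by the same factor.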


To illustrate the concept of relative multiperiod poverty, we compare two time 
spans in Indonesia, namely the time spans 1993/97 and 1997/2000. The difference 
in relative poverty between these two time spans is presented in Figure 1.6 (note 
that incomes are standardized to 50% of the median income, i.e. a value of 0.8 
corresponds to 40% of the median). The x and y axes measure relative income,
ỹ, at the beginning and the end of the time spans. The figure does not show any
systematic pattern. This is supported by Table 1.5, which shows the grid of test
points. Here the 0 in the third row of the third column, for example, means that
the share of individuals who had less than 50% of the median income (ỹ = 1) did
not significantly change between the time spans 1993/97 and 1997/2000. Hence, 
no conclusions about changes in multiperiod poverty can be drawn. 


Our concept of relative poverty orderings is also applicable to cross-country 
comparisons. Absolute poverty comparisons using some agreed international 
poverty line are interesting if countries have comparable and rather low living 
standards. But for countries with very different living standards or for very rich 
countries, relative poverty might be more relevant. To illustrate this, we now
compare Peru to Indonesia. Peru has a median income of PPP US$ 4.7 and Indonesia of
PPP US$ 3.7 per person per day. For these two countries, we consider the time span
1997/2000 with income observations in 1997 and 2000 for each. 


Table 1.6 shows the matrix of test points of differences of the two-period 
poverty surfaces (‘Peru minus Indonesia’). Relative poverty is higher in Peru. 
Even though dominance cannot be established over the entire domain, the maxi- 
mum poverty set for relative dynamic poverty is wide enough to conclude domi- 
nance. The proportion of poor individuals is higher in Peru no matter what ‘rea- 
sonable’ relative poverty line or aggregation procedure is chosen. 


⁷ Note that it does not matter which share of the median is used as the poverty line.


Figure 1.6: Relative poverty in Indonesia: Difference in Dominance Surfaces (1993/97 - 1997/2000)
[Surface plot of the differences in dominance surfaces. Recovered axis label: household income per capita, period 2.]
Income: Household income per capita per day in PPP US$
Source: Authors’ calculations based on IFLS


Table 1.5: Relative Poverty in Indonesia: Difference in Dominance Surfaces (1993/97 - 1997/2000)

Income                              Income period 2
period 1   0.8  0.9  1.0  1.1  1.2  1.3  1.4  1.5  1.6  1.7  1.8  1.9  2.0

0.8          1    1    0    0    0    0    0    1    1    1    1    1    1
0.9          1    0    0    0    0    0    0    1    1    1    1    0    0
1.0          0    0    0    0    0    0    0    0    0    0    0    0    0
1.1          0    0    0    0    0    0    0    0    0    0    0    0    0
1.2          0    0    0    0    0    0    0    0    0    0    0    0    0
1.3          0    0    0    0    0    0    0    0    0    0    0    0    0
1.4          0    0    0    0    0    0    0    0    0    0    0    0    0
1.5          0    0    0    0    0    0    0    0    0    0    0    0    0
1.6          0    0    0    0    0    0    0    0    0    0    0    0    0
1.7          0    0    0    0    0    0    0    0    0    0    0    0    0
1.8          0    0    0    0    0    0    0    0    0    0    0    0    0
1.9          0    0    0    0    0    0    0    0    0    0    0    0    0
2.0          0    0    0    0    0    0    0    0    0    0    0    0    0

Income is household income per capita per day in US$, standardized by a relative poverty line, z =
50% of median income: relative income = income / z; 1 indicates that the 1993/97 surface was significantly
above the 1997/2000 surface, -1 indicates the opposite, 0 indicates no significant difference.
Significance level: 5%; Source: Authors’ calculations based on IFLS


Table 1.6: Relative Poverty in Peru and Indonesia: Difference in Dominance
Surfaces Peru (1997/2000) - Indonesia (1997/2000)

Income                              Income period 2
period 1

Income is household income per capita per day in US$, standardized by a relative poverty line,
z = 50% of median income: relative income = income / z; 1 indicates that the Peru surface was
significantly above the Indonesia surface, -1 indicates the opposite, 0 indicates no significant difference.
Significance level: 5%; Source: Authors’ calculations based on ENAHO and IFLS


1.3.4 Robust multiperiod poverty comparisons for the n-period 
case 


Obviously, poverty comparisons over two time spans demand panel data over
multiple periods. Consequently, the question arises of how the time spans under
consideration should be constructed if more than two periods are available within
each time span. Which period should be the end of the first and the beginning of
the second time span? How many periods should constitute a time span? These
are very general questions regarding the measurement of multiperiod poverty (or
chronic poverty more specifically). Depending on the panel data available,
several different time span constructions are often possible, varying in time span
length and the number of periods taken into account. This raises the question, for
example, of whether comparisons should be made with the maximum overlap (e.g.,
T_A[y_1, y_2, ..., y_{n-1}] vs. T_B[y_2, y_3, ..., y_n]), without any overlap (e.g.,
T_A[y_1, y_2, ..., y_{n/2}] vs. T_B[y_{n/2+1}, y_{n/2+2}, ..., y_n]), or with something
in between. Depending on these choices, poverty orderings may differ. Thus, beyond
robustness to poverty indices, poverty lines and aggregation procedures, one may
also require poverty comparisons to be robust to the construction of the time spans.
Johannes Grab - 978-3-653-00480-9 




To illustrate this, we use five waves of the Peruvian household panel data
(1998-2002). To simplify the exposition, we require that in each comparison,
the first period of time span T_A is 1998 and the last period of time span T_B is
2002. We also abstain from making comparisons for different time span lengths.
However, all remaining decisions regarding the construction of these time spans
are arbitrary and, consequently, any poverty ordering may depend on how exactly
the construction is carried out. We think there are at least five different
comparisons that make sense from an economic point of view: three where we consider
time spans comprising two periods, one where we consider time spans comprising
three periods, and one where we consider time spans comprising four periods:


[1998; 2000] vs. [2000; 2002]

[1998; 1999] vs. [2001; 2002]

[1998; 2001] vs. [1999; 2002]

[1998; 1999; 2000] vs. [2000; 2001; 2002]

[1998; 1999; 2000; 2001] vs. [1999; 2000; 2001; 2002]


Given the difficulty in determining which of these five comparisons is most 
appropriate, dynamic poverty comparisons should be robust to all of them. For 
example, one can imagine a case in which the $3 poverty line is considered to 
be a reasonable maximum poverty line when comparing poverty dynamics for the 
time span 1998-2002 in Peru. In this case, the poverty ordering is only considered 
robust if poverty dominance can be established for every possible poverty line up 
to the $3 poverty line and for every above-mentioned type of construction for the 
time spans. 

Table 1.7 shows the results of such a dominance test. Obviously, according 
to our proposed methodology, no significant ordering of poverty dynamics can be 
established for the time span 1998-2002. This is a very interesting result given 
the large number of 1’s in Table 1.7. Suppose the objective is to assess chronic 
poverty for the time span 1998-2002. Using the $2 poverty line and comparing 
the time spans [1998; 1999; 2000] and [2000; 2001; 2002], which might be judged
a reasonable comparison at first glance, one would conclude that chronic poverty
has fallen. However, taking the time spans [1998; 1999] and [2001; 2002] shows 
instead that no conclusion can be drawn. Hence, the poverty ordering depends not 
only on the chosen poverty line but also on the way the time spans are constructed. 
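The robustness requirement across time span constructions can be written down as a simple check on the test outcomes reported in Table 1.7. The sketch below merely transcribes the table; the helper names are hypothetical, and the label of the first row (1.0) is an assumption, since it is missing in the source and is inferred here from the 0.5 step of the remaining rows.

```python
# Columns follow the order of the five constructions in Table 1.7.
CONSTRUCTIONS = [
    "[98;00] vs. [00;02]",
    "[98;99] vs. [01;02]",
    "[98;01] vs. [99;02]",
    "[98;99;00] vs. [00;01;02]",
    "[98;99;00;01] vs. [99;00;01;02]",
]

# Poverty line -> test outcomes (1: earlier surface significantly above later surface).
TABLE_1_7 = {
    1.0: [0, 0, 0, 0, 0],  # row label assumed, missing in the source
    1.5: [0, 0, 0, 0, 0],
    2.0: [0, 0, 1, 1, 1],
    2.5: [1, 0, 1, 1, 1],
    3.0: [1, 0, 0, 0, 1],
    3.5: [1, 0, 0, 0, 0],
    4.0: [0, 0, 0, 0, 0],
}


def dominates(column, z_max):
    """Dominance for one construction: every test point up to z_max equals 1."""
    return all(row[column] == 1 for z, row in TABLE_1_7.items() if z <= z_max)


def robust_ordering(z_max):
    """Robust ordering: dominance must hold for every construction."""
    return all(dominates(c, z_max) for c in range(len(CONSTRUCTIONS)))


robust_up_to_3 = robust_ordering(3.0)          # False for the values in Table 1.7
three_period_at_2 = TABLE_1_7[2.0][3]          # 1: poverty appears to have fallen
non_overlapping_at_2 = TABLE_1_7[2.0][1]       # 0: no conclusion possible
```

The last two lines reproduce the example in the text: at the $2 poverty line the three-period construction signals a decline, while the non-overlapping two-period construction does not.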


1.4 Discussion 


Table 1.7: Poverty in Peru: Difference in Dominance Surfaces for Several
Construction Modes of Time Spans

Income    [98;00]    [98;99]    [98;01]    [98;99;00]    [98;99;00;01]
            vs.        vs.        vs.         vs.             vs.
          [00;02]    [01;02]    [99;02]    [00;01;02]    [99;00;01;02]

1            0          0          0           0               0
1.5          0          0          0           0               0
2            0          0          1           1               1
2.5          1          0          1           1               1
3            1          0          0           0               1
3.5          1          0          0           0               0
4            0          0          0           0               0

Income: Household income per capita per day in US$; 1 indicates that the earlier surface was
significantly above the later surface, -1 indicates the opposite, 0 indicates no significant difference.
Significance level: 5%. Source: Authors’ calculations based on ENAHO


In this paper, we presented a concept that allows multiperiod poverty comparisons
to be made over time and space without arbitrarily aggregating income over various
years. Inspired by the multidimensional stochastic dominance methodology


elaborated by Duclos et al. (2006b), we created n-period income surfaces for dif- 
ferent time spans. These surfaces were then ordered using dominance tests. Once 
dominance is established, the poverty ordering is robust to a wide range of poverty 
indices, to a wide range of poverty lines, and to a wide range of aggregation pro- 
cedures. Furthermore, we extended our framework to the measurement of relative 
poverty. 

To illustrate our methodology, we compared poverty across time spans in Peru 
and between Peru and Indonesia. Furthermore, we highlighted some general problems
of dynamic poverty comparisons, in particular how time spans should be constructed:
which period should be the end of the first and the beginning of the second
time span, and how many periods should constitute a time span. We dealt
with these questions by applying robustness tests across several of these
possibilities.

However, the approach suggested and the ideas developed in this paper also
have their limitations. The most important of these is certainly that all results are
based on a sample of expenditures declared by households and that these declarations
are generally affected by measurement error, which affects the bivariate
distribution F(y_1, y_2) (and the n-variate distribution) much more than the univariate
distribution F(y). In fact, many empirical studies show that measurement error is
such that the extent of β-convergence over time is overestimated (see Bound et
al. (2001); Breen and Moisio (2004); Grimm (2007)). For our case, this would
imply that multiperiod poverty is underestimated. In the absence of information
on ‘true income’ or any instruments, there is not much that can be done about this,
but it should be kept in mind when interpreting our results. However, the problem
is obviously not specific to our approach but inherent in most approaches to the
analysis of poverty dynamics.


Essay 2


Spatial inequalities explained - 
Evidence from Burkina Faso 


Abstract: Empirical evidence suggests that regional disparities in incomes are
often very high, that these disparities do not necessarily disappear as economies
grow and that these disparities are themselves an important driver of growth. We use a
novel approach based on multilevel modeling to decompose the sources of spatial
disparities in incomes among households in Burkina Faso. We show that spatial
disparities are driven not only by the spatial concentration of households with
particular endowments but to a large extent also by disparities in community
endowments. Climatic differences across regions also matter, but to a much smaller
extent.


Based on joint work with Michael Grimm.


2.1 Introduction


Empirical evidence suggests that regional disparities in growth and poverty are
often very high, that these regional disparities do not necessarily disappear as
economies grow and develop and that these disparities are themselves often an
important driver of the overall performance of an economy.¹ Often such regional
inequalities are closely linked to key policy choices (e.g. trade policy) and patterns
of public spending. But in most cases lagging regions also suffer from infrastructure
bottlenecks, adverse agroclimatic conditions, import competition and limited
scope for non-agricultural activities.

Burkina Faso is one among many Sub-Saharan African countries where the
regional pattern of living standards is particularly puzzling. Some of the observed
inequality can be related to cotton production, given that cotton is the main export
commodity of the Burkinabe economy. However, despite the cotton boom that
Burkina Faso experienced in the middle and at the end of the 1990s, some
cotton-producing provinces grew more slowly than non-cotton provinces. In particular,
the traditionally poor and arid North of the country developed quite well
during that time. Hence, from these observations it is difficult to judge to what
extent agro-climatic factors, trade exposure and population structure matter for
disparities in the level of and change in living standards. Explaining where such
disparities come from could help to design development strategies and interventions
to reduce them in a cost-effective way.

Standard poverty assessments usually address such issues simply by under- 
taking a rather descriptive analysis of growth patterns across regions and by per- 
forming decompositions of inequality indices by regional units. However, such 
decompositions make it very difficult to disentangle what is due to heterogene- 
ity in household characteristics and what is due to heterogeneity in area-specific 
characteristics or endowments. In other words poor areas could simply be poor 
because households with poor endowments are geographically concentrated. 

To deal with this problem, Ravallion and Wodon (1999) relied on two con- 
secutive cross-sections of household survey data for Bangladesh to run separate 
regressions for each year and for each of the urban and rural sectors. They in- 
cluded a wide range of household characteristics and attributed the remaining part 
of the observed variance to geographic effects. They then undertook a number of
robustness checks to rule out bias due to omitted household characteristics
that are spatially correlated. The authors conclude that there are sizeable
spatial differences in the returns to given household characteristics, i.e. the
same household might be poor in one region but not in another.


¹ The ‘Operationalizing Pro-Poor Growth Project’, for instance, which was coordinated by the
World Bank and British, French and German donors, shows various cases in point (see Besley and
Cord (2007); Grimm et al. (2007)).

Another approach was chosen by Jalan and Ravallion (2002) and later by De
Vreyer et al. (2009). They used several waves of panel data to implement a
quasi-differencing method to identify the impact of locally determined geographic and
socioeconomic variables on households’ consumption growth while removing
unobserved household and community fixed effects. These authors find, for rural
China and Peru respectively, robust evidence of geographic poverty traps and
highlight in particular the socio-economic features of villages and the provision
of public goods, such as rural roads, as important area-specific determinants.

Benson et al. (2005) alternatively used spatial regression and geographically
weighted regression techniques to allow regression error terms to be spatially
correlated and to assess the degree to which determinants of poverty and the
prevalence of poverty vary across space. For rural Malawi the authors find
little evidence of local poverty traps, characterized for instance by low agricultural
productivity, and emphasize that the determinants of poverty vary spatially
in their effects across the country. However, they find some evidence that regions
with more opportunities for non-agricultural earnings and more markets, public
infrastructure and services show less poverty.

While all these studies suggest that poverty reduction efforts have to be targeted
at the sub-national level, they do not provide a decomposition of the variance
in living standards observed within and between spatial units. In this paper we
suggest a novel methodology to address this issue. We build a multilevel random
coefficient model able to decompose the variance in living standards across four
spatial levels: households, communities, provinces and (agro-climatic) regions.²
Moreover, our model allows us to decompose the variance measured at each level
into a component accounting for the variance in level-specific characteristics and
components accounting for a sorting of lower-level characteristics across these
levels. For instance, the variance in households’ living standards between
communities might be driven by the variance in community-specific endowments and
by a sorting of households with favorable and unfavorable characteristics across
communities.

To implement our approach for Burkina Faso, we build a very detailed and ex- 
haustive data set combining household living standard measurement survey data, 
population census data, agricultural survey data and a number of statistics col- 
lected at the provincial level. 

The remainder of our paper is organized as follows. In Section 2 we describe 
spatial inequality and its development over time in Burkina Faso. In Section 3 we 
present our data and the empirical strategy. In Section 4 we discuss our results. In 
Section 5 we conclude. 


² Similar techniques have been applied by Bolstad and Manda (2001) and Ecob (1996) to study
spatial inequality in child mortality and health.


2.2 Regional Growth and Inequality in Burkina Faso


Burkina Faso is one of the poorest countries in the world. GDP per capita is 
estimated at only PPP US$ 1,213 and according to the Human Development In- 
dex, the country was ranked 176th out of 177 countries (UNDP, 2007). It is a 
landlocked country in the middle of West Africa with a population of roughly
13.4 million. It has a very low human capital base and only very few natural
resources. The country depends highly on cotton exports, which account for almost
60 percent of total export earnings, as well as on international aid. More than 80
percent of the Burkinabe population lives in rural areas working predominantly in
the agricultural sector, which suffers from very limited rainfall and recurrent
severe droughts. The country experienced sustained growth with moderate poverty
reduction during the last 15 years, accompanied however by important variations
over time and space (Grimm and Günther (2007)).

If income levels and growth rates as well as poverty shares are compared
across Burkina Faso’s 13 regions (see Table 2.1),³ one can see that the Western
regions, where the bulk of cotton is produced (Hauts-Bassins, Mouhoun and
Cascades), are richer than the remaining regions (abstracting from the two urban
centers Ouagadougou and Bobo-Dioulasso). However, in terms of growth in the
subsequent period, the non-cotton and initially very poor Eastern regions (Sahel, Est
and Centre-Nord) performed better than all cotton regions, despite the very favorable
development of cotton exports and the widespread belief that cotton exports
were the driver of Burkina Faso’s growth. In terms of poverty, Hauts-Bassins
still has, given its relatively high income level (by Burkinabe standards), moderate
poverty, without however any significant poverty reduction since 1994. Mouhoun,
another of the important cotton regions, has always had and still has very high
poverty levels. The cotton region Cascades managed to halve poverty between 1994 and
2003 (Grimm and Günther (2007)).


³ The household survey data are presented in detail in Section 3.

To see if the observed pattern of economic growth and poverty reduction follows
a similar pattern at the provincial level, i.e. to see whether provinces in
a given region develop similarly, we further disaggregate the data according to
Burkina Faso’s 45 provinces. The results are presented using maps (Figure 2.1).
These maps indicate two important aspects. First, economic growth does not
occur on some widespread regional level, nor does there seem to be a high regional
concentration of poverty. Rather, the intensity of growth and poverty varies across
provinces over the whole country. Second, the set of provinces with the highest
poverty incidence changes over time. Similar to what Benson et al. (2005)
found for rural Malawi, there do not seem to be spatial poverty traps in Burkina
Faso.


Figure 2.1: Growth and Poverty Incidence on Provincial Level
[Maps of growth and poverty incidence by province.]
Source: EP I, EP II, EP III, estimations by the authors


If we disaggregate our data further by the 135 districts (départements) which
are covered by the household surveys⁴ and plot household expenditures per capita
in 1994 against growth of household expenditures per capita over the period 1994
to 2003, the data suggest β-convergence in living standards across these local
units. However, this kind of convergence might be exaggerated if expenditures
per capita are measured with error (see e.g., Sala-i-Martin (1996)). Although we
provide below some evidence of why such convergence could have occurred, we
do not find robust empirical evidence for these channels and we cannot rule out
that measurement error plays an important role. First, we do not find
evidence for σ-convergence, which would be immune to the measurement error
problem (see e.g., Sala-i-Martin (1996)). Second, we find a much smaller
β-convergence coefficient if we regress the growth rate of expenditures from 1998
to 2003 on expenditure levels in 1994, which again could be a sign of measurement
error. However, one should note that 1998 is a very particular year, since the
1997/98 harvest was affected by a severe drought, even by Burkinabe standards.

⁴ In total Burkina Faso has 301 districts (départements).
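The concern about measurement error can be illustrated with a small simulation. The sketch below is not based on the survey data: it assumes log expenditures with no true convergence and adds independent classical measurement error to both survey years. All names and parameter values (for example `error_sd`) are hypothetical. Regressing observed growth on the observed initial level then yields a negative slope, i.e. spurious β-convergence.

```python
import random


def ols_slope(x, y):
    """Slope of a bivariate OLS regression of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


def simulate_convergence(n=5000, error_sd=0.5, seed=1):
    """Return (slope with true data, slope with error-ridden data)."""
    rng = random.Random(seed)
    # True log expenditure in the initial year (assumed standard normal).
    true0 = [rng.gauss(0.0, 1.0) for _ in range(n)]
    # True growth is unrelated to the initial level: no convergence by construction.
    true1 = [t + rng.gauss(0.0, 0.1) for t in true0]
    # Observed values: truth plus independent measurement error in each year.
    obs0 = [t + rng.gauss(0.0, error_sd) for t in true0]
    obs1 = [t + rng.gauss(0.0, error_sd) for t in true1]
    growth_true = [b - a for a, b in zip(true0, true1)]
    growth_obs = [b - a for a, b in zip(obs0, obs1)]
    return ols_slope(true0, growth_true), ols_slope(obs0, growth_obs)


beta_true, beta_obs = simulate_convergence()
```

Under these assumptions the slope on observed data tends towards minus the share of measurement error in the variance of the observed initial level, here -0.25/1.25 = -0.2, although the true slope is zero.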


Figure 2.2: Convergence in Burkina Faso, initial per capita income and growth at
the department level (135 observations), 1994-2003

Hence, the question arises of how income disparities between households and
across spatial units can be explained. What is the contribution of the variance in
household characteristics and level-specific endowments such as public services,
infrastructure and climate? To what extent does the spatial clustering of households
play a role? Are the effects of relevant factors similar across spatial units or
do they vary significantly across the country? Answers to these kinds of questions
have not yet been given for Burkina Faso, but seem crucial to appropriately target
poverty alleviation strategies. The only study we have found that did research in
that direction for the case of Burkina Faso is Bigman et al. (2000). Similar to
our study, the authors use a very detailed data set combining information from the
household, village, district and provincial level and construct a poverty map at the
level of villages. From that map the authors conclude that differences in the incidence
of poverty among regions are primarily due to differences in agro-climatic
conditions, whereas differences in the incidence of poverty among villages within
the same region often reflect past policy biases that led to differences in the
quality of roads or public services.


2.3 Data and Empirical Strategy 


2.3.1 Data 


Burkina Faso is organized in 13 agro-climatic regions, 45 provinces and 301
districts (départements). It has 26 cities and towns (population > 5,000) and roughly
9,000 villages. According to the last census in 2006 the urbanization rate was
about 16 percent and the average population density 48.4 persons per km². The
two major cities are Ouagadougou, the capital, with a population of roughly 1.1
million and Bobo-Dioulasso with a population of about 0.4 million. The third city,
Koudougou, has a population of only 83.4 thousand.⁵ The variables we use have
been collected from a large number of sources and at different levels of that
organizational structure. However, it was very difficult to find and get access to data
on agro-climatic characteristics, infrastructure and public services and, where such
data existed, to match them to other sources. This seems to be a problem in many of the
least developed countries and may explain why only very few attempts have been
made so far to analyze the effects of area-specific characteristics on households’
living standards.

First, household data are drawn from three nationally representative household
surveys, the Enquêtes Prioritaires (EP), conducted in 1994 (EP I), 1998 (EP
II) and 2003 (EP III), covering around 8,500 different households in each year.
These surveys were conducted by the Institut National de la Statistique et de la
Démographie (INSD) with technical and financial support of the World Bank. These
surveys contain relatively detailed information on households’ socio-demographic
characteristics, education, employment, agricultural and non-agricultural activities
as well as consumption, income and some assets.⁶

Given the usual low quality of income data in poor rural settings, we use
household expenditure per capita as an indicator of households’ living standards.
Expenditures were deflated over time and space using appropriate price deflators.
A critical issue in our study is of course the choice of deflators used to correct for price
differences across space. For this purpose we use deflators provided by the INSD
in each survey year for Burkina Faso’s 13 regions (based on price data collected
on 37 different regional markets).

⁵ Statistics taken from INSD, see http://www.insd.bf.

⁶ A detailed description of these data sets can be found in Grimm and Günther (2007).

Second, we draw data on the community (or cluster) level from several
sources. Although, except in 1998, the above-mentioned household surveys were
not linked to any village survey, the questionnaires contain some questions
regarding the time needed to reach the nearest primary and secondary school, the nearest
health center, road, market and drinking water point. In 1998 a specific community
survey was added to the household survey, which collected further community
data for 325 of the 425 communities covered by the survey. Further community
variables were constructed simply by aggregating household characteristics at the
community level. However, a community panel cannot be constructed because
the survey years do not cover exactly the same communities.

Third, data on the size of agricultural production units, fertilizer use and the
use of modern production technologies in agriculture are drawn from a yearly
agricultural survey called Enquête Agricole. This survey is conducted by the Ministry
of Agriculture in collaboration with INSD. Since the data set uses a different
survey design than the EPs, we merged the information with the other data sources at
the provincial level, the smallest common regional unit. The average size of
agricultural production units, fertilizer use and information about modern production
technologies are therefore provincial averages.

Fourth, data on agro-climatic conditions such as monthly rainfall for the period 
1993-2006 on the provincial level, and monthly minimum and maximum temper- 
atures on the regional level were obtained from the Directorate of Meteorology 
(Direction de la Météorologie). 

Fifth, data on the provision of public services, infrastructure and population 
densities, also at the provincial level, were obtained from the Ministry of
Infrastructure (Direction Générale de l’Aménagement du Territoire). Note that we do
not have any data on project aid, hence the effect of aid will be in the
unobservables.

Hence, as stated above, the data set we use is organized in four levels: the
household, the community (cluster), the province and the region. Table 2.2 lists
all variables used along with their means, standard deviations and sources.


2.3.2 Empirical Strategy


To analyze the determinants of income levels and to decompose the variance in
income levels across spatial units, we use a multilevel (also called hierarchical or
mixed) regression model.⁷ Multilevel models are widely used in social science, sociology
and health research to specify the effect of social context on individual-level
outcomes.⁸ Due to the frequent lack of hierarchical data and probably due to
the very time-consuming estimation procedure, multilevel models are less popular
in economics than in these other disciplines.⁹


A multilevel model 


A multilevel model is best described by starting with a two-level random
coefficient model with only one explanatory variable. The idea of the model is
that the regression coefficients on the first level (e.g. households), indexed
by i, are treated as random variables at the second level (e.g. communities),
indexed by j.

The model equation reads:


$$Y_{ij} = \beta_{0j} + \beta_{1j} X_{ij} + \varepsilon_{ij} \qquad (2.1)$$

The regression coefficients $\beta_{0j}$ and $\beta_{1j}$ can be expressed as:


$$\beta_{0j} = \gamma_{00} + U_{0j} \qquad (2.2)$$


$$\beta_{1j} = \gamma_{10} + U_{1j} \qquad (2.3)$$


Equation 2.2 shows that for each unit j on the second level, a specific
intercept, $U_{0j}$, is introduced into the model. These intercepts are, however,
not directly estimated as fixed coefficients within the model. Multilevel models
estimate the variance of these $U_{0j}$. They are therefore often referred to as
random intercepts. Equation 2.3 shows that a specific $\beta$-coefficient,
$U_{1j}$, is introduced, allowing the effects associated with the covariates to
vary across units on the second level. Since only the variance of these
coefficients is estimated, it is referred to as a random coefficient. Models that
only include random intercepts are called random intercept models, while models
that include random intercepts and random coefficients are called random
coefficient models.


7 For a comprehensive overview of the statistical theory underlying multilevel
modeling and of various illustrative applications, see e.g. Goldstein (2003) and
Hox (1995).

8 For a good overview of applications in that area, see DiPrete and Forristal
(1994).

9 Economists rely on these models in particular for out-of-sample predictions to
perform small area estimations, for instance to construct a poverty map (see
Elbers et al. (2003) and Jiang and Lahiri (2006)). A paper which deals with causal
multilevel models is, for example, Aassve and Arpino (2007).

Finally, the combined model can be expressed as consisting of a fixed part
(first term) and a random part (second term):10


$$Y_{ij} = (\gamma_{00} + \gamma_{10} X_{ij}) + (U_{0j} + U_{1j} X_{ij} + \varepsilon_{ij}) \qquad (2.4)$$


It is straightforward to extend the model to more than two levels. The model can 
also be used to check for significant variation of the random intercepts and slope 
coefficients across units on each level. Moreover, it is possible to analyze the 
covariance of the random intercepts and slopes. 
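
The equivalence between the level-by-level formulation (equations 2.1 to 2.3) and the combined formulation (equation 2.4) can be checked numerically. The following is a minimal simulation sketch, not the estimation routine used in this chapter; the function name, all parameter values and the assumption that the random intercept and slope are drawn independently from normal distributions are illustrative choices.

```python
import random


def simulate_two_level(n_groups, n_per_group, gamma00, gamma10,
                       sd_u0, sd_u1, sd_eps, seed=0):
    """Simulate households (level 1) nested in communities (level 2).

    Returns a list of (group, x, y_structural, y_combined) tuples, where
    y_structural follows equations 2.1-2.3 and y_combined follows 2.4.
    """
    rng = random.Random(seed)
    rows = []
    for j in range(n_groups):
        u0 = rng.gauss(0.0, sd_u0)  # random intercept deviation U_0j
        u1 = rng.gauss(0.0, sd_u1)  # random slope deviation U_1j
        beta0 = gamma00 + u0        # equation 2.2
        beta1 = gamma10 + u1        # equation 2.3
        for _ in range(n_per_group):
            x = rng.gauss(0.0, 1.0)
            eps = rng.gauss(0.0, sd_eps)
            y_structural = beta0 + beta1 * x + eps                      # equation 2.1
            y_combined = (gamma00 + gamma10 * x) + (u0 + u1 * x + eps)  # equation 2.4
            rows.append((j, x, y_structural, y_combined))
    return rows


# All parameter values below are made up for illustration.
rows = simulate_two_level(n_groups=50, n_per_group=20, gamma00=11.0,
                          gamma10=0.3, sd_u0=0.4, sd_u1=0.1, sd_eps=0.6)
max_gap = max(abs(ys - yc) for _, _, ys, yc in rows)
print(len(rows), max_gap < 1e-9)
```

Both formulations produce the same outcome up to floating point rounding, which is all that the substitution of 2.2 and 2.3 into 2.1 claims.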


Strengths of a multilevel model 


Multilevel models offer several advantages over other models. They make it
possible to combine nested data from different sources, to decompose variation
across levels and to model the variation of effects across spatial units. In what
follows we discuss each of these advantages.


Efficient Estimation 


Since we built our data set from several different and independent data sets,
variables are observed on multiple nested levels (see Table 2.2). The clustering
that stems from this nested structure requires us to account for intra-group
correlations. Under the assumption that individuals and households within the
same higher-level unit are more alike than individuals and households from
different units, within-group residuals are likely to be correlated. Applying
standard OLS regression to nested data leads to wrong estimates of the standard
errors and, hence, statistical inference can be wrong. In a multilevel data set
the unexplained variance should be decomposed into the variance on all nested
levels. This is exactly what the multilevel model does, which makes it possible
to obtain efficient estimates (see Goldstein (2003)).
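
To give a sense of how much within-group correlation can matter for inference, the sketch below computes the standard design effect for the variance of a sample mean under equal-sized clusters, 1 + (m - 1) * rho. This is an illustration only, not a calculation from this chapter: the formula assumes equal cluster sizes and applies to a simple mean rather than to regression coefficients, the cluster size of 20 is the maximum number of households per community, and the intra-class correlation of 0.219 and the 8595 observations are taken from the 1994 null model in Tables 2.6 and 2.3.

```python
def design_effect(cluster_size, icc):
    """Variance inflation of a sample mean under equal-sized clusters."""
    return 1.0 + (cluster_size - 1) * icc


def effective_sample_size(n_obs, cluster_size, icc):
    """Number of independent observations carrying the same information."""
    return n_obs / design_effect(cluster_size, icc)


# Illustrative inputs: at most 20 households per community, 1994 null-model
# community share of 21.9 percent, 8595 households.
deff = design_effect(cluster_size=20, icc=0.219)
n_eff = effective_sample_size(n_obs=8595, cluster_size=20, icc=0.219)
print(round(deff, 3), round(n_eff))
```

Under these assumptions, standard errors that ignore the clustering would be too small by a factor of roughly the square root of the design effect.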


Variance partitioning 


In a multilevel random intercept model, the decomposition of the error term makes
it possible to assess how much of the total variance is attributable to variation
on the different nested levels. Moreover, it can be assessed how much of the
variance measured on each level is due to the variance in level-specific
characteristics and how much is due to sorting of lower-level characteristics
across these levels. For instance, the variance in households’ living standards
between communities might be driven by the variance in community-specific
endowments and by a sorting of households with favorable and unfavorable
characteristics across communities.


10 Fixed effects are hereafter denoted as coefficients which are directly
estimated by the model. For random effects only the variance and its standard
error are estimated.

More precisely, sticking to this two-level example, we can answer the following
questions:


1. How much of the total variance in incomes between households is attributable
to differences between communities?


2. How much of the variance between communities can be explained by differences
in observed household characteristics between these communities?


3. How much of the variance between communities can be explained by differences
in observed community characteristics?


The contribution of the variance at each level to the total variance can be
measured with the so-called ‘variance partition coefficient’, also called the
‘intra-class correlation coefficient’ (‘icc’, hereafter), $\rho$.11 Since a
multilevel model implicitly assumes errors to be independently distributed across
levels, the total variance of the dependent variable can be decomposed as the sum
of the variance on each level. If we use again the two-level model as an example,
the decomposition of the variance by level reads:


$$\mathrm{var}(Y_{ij} \mid X_{ij}) = \mathrm{var}(U_{0j}) + \mathrm{var}(\varepsilon_{ij}) = \sigma_{U}^{2} + \sigma_{\varepsilon}^{2}. \qquad (2.5)$$

Accordingly, the icc of the second level can be expressed by:


$$\rho = \frac{\sigma_{U}^{2}}{\sigma_{U}^{2} + \sigma_{\varepsilon}^{2}} \qquad (2.6)$$


The intra-class correlation coefficient measures the correlation of the residual 
of the response variable of households stemming from the same community. A 
high p in equation 2.6 would point to a large impact of the second level, for 
instance the community, on first level outcomes, i.e. on the level of households. 
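
The link between the variance share and the correlation of residuals can be made explicit. The short derivation below is a sketch for the two-level random intercept model only (no random slope), assuming, as in the text, that the community effect $U_{0j}$ and the household errors $\varepsilon_{ij}$ are mutually independent; $\sigma_U^2$ and $\sigma_\varepsilon^2$ are shorthand for $\mathrm{var}(U_{0j})$ and $\mathrm{var}(\varepsilon_{ij})$.

```latex
\begin{aligned}
\mathrm{cov}(Y_{ij}, Y_{i'j} \mid X)
  &= \mathrm{cov}(U_{0j} + \varepsilon_{ij},\; U_{0j} + \varepsilon_{i'j})
   = \mathrm{var}(U_{0j}) = \sigma_U^2, \qquad i \neq i', \\
\mathrm{corr}(Y_{ij}, Y_{i'j} \mid X)
  &= \frac{\sigma_U^2}{\sqrt{(\sigma_U^2 + \sigma_\varepsilon^2)(\sigma_U^2 + \sigma_\varepsilon^2)}}
   = \frac{\sigma_U^2}{\sigma_U^2 + \sigma_\varepsilon^2} = \rho .
\end{aligned}
```

Two households from different communities share no random term, so their residual correlation is zero under the same assumptions.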

11 $\rho$ is called ‘intra-class correlation coefficient’ since it measures the
degree to which observations in the same unit of a given level, e.g. households
within a given community, are dependent.


Finally, the decomposition allows us to draw conclusions on the explanatory
power of the covariates with respect to the variation on the different levels
(see Borgoni et al. (2002)). For instance, we can answer the question whether the
observed spatial pattern in income levels is explained by differences in regional
variables, like geographic traits, by differences in community characteristics,
like access to certain public goods, or rather by differences in household
characteristics, like household size and education. This is a major conceptual
advantage of a multilevel model. If we ran a household income regression with
explanatory variables on higher levels, but without a multilevel structure,
significant coefficients of these variables would be likely to pick up variation
which is at least partly due to omitted household level variables. In contrast,
if we introduce a random intercept on each level, we can test the explanatory
power of level-specific variables on each level separately. Whenever an
introduced variable reduces the variance of the level-specific error term, we can
conclude that this variable explains part of the variance in incomes on that
level (see Ecob (1996)).
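
The variance-reduction check described above amounts to comparing each estimated variance component before and after a set of covariates is added. A minimal sketch follows; the function name and the variance components are hypothetical and serve only to show the arithmetic.

```python
def proportional_change(var_before, var_after):
    """Relative change of a variance component between two nested models."""
    if var_before <= 0:
        raise ValueError("var_before must be positive")
    return (var_after - var_before) / var_before


# Hypothetical variance components of a null model and of a model with
# household covariates (not estimates from this chapter).
before = {"region": 0.030, "province": 0.012, "community": 0.080, "household": 0.200}
after = {"region": 0.018, "province": 0.010, "community": 0.040, "household": 0.160}

changes = {level: proportional_change(before[level], after[level])
           for level in before}
for level, change in changes.items():
    print(f"{level:10s} {change:+.1%}")
```

A negative value means that the added covariates absorb part of the unexplained variation at that level; a positive value means that the component grew.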


Area-specific returns 


A multilevel model designed as a multilevel random coefficient model (‘RC’
hereafter) makes it possible to take into account a possible variation in the
factor coefficients across spatial units. Finding significant variation in the
effects of individual characteristics across spatial units suggests that area
modifies the association between individual characteristics and income (see Merlo
et al. (2005b)). In our case, for instance, it will be interesting to see whether
effects associated with education, cotton cultivation or household composition
are constant across spatial units.


Covariance structure of random effects 


Finally, the RC model allows us to investigate the covariance structure of the 
random intercepts and random slope coefficients. For instance, it might be that 
communities with lower average income levels (a lower intercept) have higher 
returns associated with education or cotton cultivation. A significant negative 
correlation, for example, could explain the convergence described in Section 2. 


To conclude, based on these methodological considerations, we believe that a
multilevel model is particularly suitable to identify the sources of spatial
inequalities. Our methodology is capable of decomposing spatial inequality into
the contribution of household and area-specific characteristics, of identifying
the key spatial determinants of inequality and of tracking variations in returns
across space, thereby preserving most of the advantages of the methods used by
Ravallion and Wodon (1999), Jalan and Ravallion (2002) and Benson et al. (2005).
Complementing the geographical analogue of the Oaxaca-Blinder decomposition
proposed by Ravallion and Wodon (1999), our decomposition methodology allows us
to attribute weights to the contribution of the various levels to total
inequality. Moreover, in addition to the identification of higher level variable
effects on household income, which is done in Jalan and Ravallion (2002) using a
GMM-type approach, our model differentiates in principle between significant
higher level effects explaining higher level inequality and significant higher
level effects just picking up omitted household characteristics.

Obviously, our methodology also has some drawbacks. In the absence of panel
data, we cannot rule out endogeneity problems for some of our explanatory
variables. However, the methodology we propose is just as applicable to panel
data as it is to cross-sectional data. Multilevel models are also often
criticized for inconsistent parameter estimation. However, our focus is not
primarily on consistent parameter estimation but on variance partitioning. It
should be noted, moreover, that because we observe only few households per
community (at most 20), introducing dummy variables for each higher level unit to
satisfy the independence assumption would lead to a significant
over-parametrization (Lombardia and Sperlich, 2007). At the same time, the
effects of all higher level variables, which are key for our analysis, could not
be identified. Thus, we will construct our multilevel model in a way that lets us
benefit from all the advantages of a multilevel model while using our large data
set to control as much as possible for unobserved heterogeneity.


Modeling Strategy 


We use an iterative procedure to estimate the sources of spatial inequality. We
start with a multilevel random intercept model (M0) that does not include any
covariates. We then iteratively introduce household level variables (M1),
community variables (M2) and provincial and regional variables (M3) into the
model.13 At each stage, we are mainly concerned with two questions:


1. What are the key characteristics determining per capita income disparities? 


2. To what extent are the characteristics responsible for the spatial variation 
observed on each level? 


Finally, we will augment our multilevel model in section 2.4.5 by allowing 
coefficients of household characteristics to vary across communities and by mod- 
eling the covariances of the random effects on that level (M4). Investigating the 
variance of the random coefficients and the correlation between random intercepts 
and slopes, the model can help answer the following questions: 


12 Multilevel modeling only controls for unobserved heterogeneity on each level
as long as the independence assumption between unobserved characteristics and the
regressors holds. In the context of hierarchical data, multilevel models assume
area effects to be independent of the covariates and any unobserved individual
effects.

13 Following such an iterative modeling procedure has some drawbacks. Models M1
and M2 may suffer from dependency with observed but non-included higher level
variables. Results should therefore, as will be mentioned later on, be
interpreted with caution. Alternatively, one could run M3 right after M0. Then,
it would be possible to estimate directly the contribution of the set of observed
variables of each level to the reduction of each variance component.

3. Does area modify the association between household characteristics and
respective outcomes?


4. What might be the explanation for such a modification? 


We estimate our model for three points in time: 1994, 1998 and 2003. This will
also give us some insights into the dynamics of spatial inequality and its
determinants. Our full four-level random coefficient model reads:


$$
\begin{aligned}
Y_{ijkl} = {} & \Big(\gamma_{0000} + \sum_{p=1}^{P} \gamma_{p000} X_{pijkl} + \sum_{q=1}^{Q} \gamma_{0q00} C_{qjkl} + \sum_{r=1}^{R} \gamma_{00r0} P_{rkl} + \sum_{m=1}^{M} \gamma_{000m} R_{ml}\Big) \\
& + \Big(W_{l} + V_{kl} + U_{0jkl} + \sum_{p=1}^{P} U_{pjkl} X_{pijkl} + \varepsilon_{ijkl}\Big)
\end{aligned}
\qquad (2.7)
$$


where i stands for households, j for communities, k for provinces and l for
regions. X, C, P and R are vectors of household, community, provincial and
regional characteristics, respectively. $W_{l}$ is the regional random intercept,
$V_{kl}$ the provincial random intercept and $U_{0jkl}$ the community random
intercept. The models are estimated using Stata and its mixed model command
‘xtmixed’.14


2.4 Results: Sources of Spatial Inequality 


2.4.1 Model M0: The null model


For each year for which we estimate our model, we begin with a four-level null
model in which we introduce nothing but a random intercept on the community, the
provincial and the regional level. Using a likelihood ratio test, we check
whether the three-level model, nested in the four-level model, performs better
than the four-level model (see Goldstein (2003)). Since this is not the case for
any of the three years under consideration, we will use a four-level model in the
following.
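
The mechanics of such a likelihood ratio comparison can be sketched as follows. This is an illustration under stated assumptions, not the test output reported in the tables: the two log-likelihood values are hypothetical, the function name is invented, and the restricted model is assumed to drop exactly one variance component. Because a variance cannot be negative, the null value lies on the boundary of the parameter space, so the sketch also reports the p-value from the usual 50:50 mixture of a chi-square with zero and with one degree of freedom.

```python
import math


def lr_test_one_variance(loglik_restricted, loglik_full):
    """Likelihood ratio test for dropping a single variance component.

    Returns (statistic, naive chi2(1) p-value, boundary-adjusted p-value).
    """
    stat = max(2.0 * (loglik_full - loglik_restricted), 0.0)
    # Survival function of a chi-square with 1 degree of freedom.
    p_chi2 = math.erfc(math.sqrt(stat / 2.0))
    # Boundary case: 50:50 mixture of chi2(0) and chi2(1).
    p_boundary = 1.0 if stat == 0.0 else 0.5 * p_chi2
    return stat, p_chi2, p_boundary


# Hypothetical log-likelihoods of a three-level and a four-level null model.
stat, p_chi2, p_boundary = lr_test_one_variance(-9720.0, -9711.5)
print(round(stat, 1), p_chi2 < 0.001)
```

A small p-value means that the restricted model fits significantly worse, which supports keeping the additional level.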

Our base model, M0, reads:


$$Y_{ijkl} = \gamma_{0000} + W_{l} + V_{kl} + U_{0jkl} + \varepsilon_{ijkl}, \qquad (2.8)$$


where $Y_{ijkl}$ stands for the log of household expenditure per capita. The
results of model M0 for each year are shown in Tables 2.3 - 2.5.


14 The estimation procedure is based on an iterative generalized least squares
approach (discussed in Goldstein (2003)). This procedure starts with the
estimation of the fixed effects coefficients using ordinary least squares. The
resulting residuals are stored. Afterwards, an iterative procedure begins,
starting with a generalized least squares regression in a first step. Then, in a
second step, the residuals of this regression are used to compute the variance of
the random coefficients. These steps are then iterated.

Table 2.3: Models - 1994 - Fixed effects


MO 
Household level 
HHsize 
Children Adult 
Youth Adult 
Elderly Adult 
Age 
Sex 
Literate Head 
Literate Adult 
Cotton 
Livestock 
Muslim 
Christian 
Mossi 


М1 
-0.040 
-0.054 
-0.060 
-0.006 


0.038 
0.440 


0.056 


жж 


жжж 


жж 


жж 


LIII 
LIII 


жжж 


M2 
-0.040 
-0.053 
-0.055 
-0.006 


0.036 
0.367 


0.048 


LIII 


Жжж 


LIII 


LIII 


LLLI 


жж 


жж 


M4 
-0.042 
-0.042 
-0.238 
-0.004 


0.260 
0.337 


0.018 


жж 


Community level 
ZD Religion 

ZD Ethnicity 

ZD Cotton 

ZD Livestock 

ZD Literate Adult 
ZD Literate Head 
ZD Hhsize 

ZD Children Adult 
ZD Youth Adult 
ZD Elderly Adult 
Electricity 

ZD Urban 
Primary Access 
Secondary Access 
Healthcenter Access 
Market Access 
Provincial level 
Landsize 

Rain 

Pop. Density 
Tarred Road 

Size 


0.039 


0.549 


-0.092 


0.169 
0.164 


жжж 


LI 


LLLI 


0.030 


0.390 


-0.029 


0.176 
0.144 


жжж 


Regional Level 

Ltempmax 

Constant 11.080 
AIC 19423 
LR test 0.000 
Obs 8595 


жжж 


11.590 
17065 


8595 


жж 


11.350 
16780 


8595 


Жжж 


11.330 
16187 
0.000 

8595 


Source: Authors’ calculations based on individual dataset 

To obtain the contribution of the variance at each level to the total variance,
we calculate the icc for each level. Recall that the icc (e.g. for level 2, the
community level) is written as:


$$\rho = \frac{\sigma_{U}^{2}}{\sigma_{W}^{2} + \sigma_{V}^{2} + \sigma_{U}^{2} + \sigma_{\varepsilon}^{2}} \qquad (2.9)$$


where $\sigma_{W}^{2}$, $\sigma_{V}^{2}$, $\sigma_{U}^{2}$ and
$\sigma_{\varepsilon}^{2}$ denote the variances of the regional, provincial,
community and household level error terms.

The icc for all years and levels are shown in Table 2.6. For instance, the
intra-class correlation coefficient, $\rho$, for the community level in the year
1998 is equal to approximately 26.5%.15 In other words, in 1998 26.5 percent of
the total variance is situated at the community level. In this case, the icc
measures the correlation of the residual of the response variable of households
stemming from the same community. The high icc of the community level, which is
almost as high in 1994 (21.9 percent) and 2003 (20.5 percent), indicates two
things. First, it underlines the importance of using a multilevel approach to get
efficient estimates. Second, it suggests strong community effects which are
relatively stable over time. The latter finding is particularly interesting in
our case, since the more alike households’ incomes within a community are, the
more likely it is that incomes are directly related to the contextual environment
of the communities (see Merlo et al. (2005a)).
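
With more than two levels, the share of each level follows from dividing its variance component by the sum of all components. The sketch below illustrates this; the function names are invented and the variance components are hypothetical values, scaled so that the resulting shares match the 1998 M0 column of Table 2.6.

```python
def variance_shares(components):
    """Share of each level's variance component in the total variance."""
    total = sum(components.values())
    if total <= 0:
        raise ValueError("total variance must be positive")
    return {level: value / total for level, value in components.items()}


def same_unit_correlation(components, shared_levels):
    """Residual correlation of two households sharing the given levels."""
    shares = variance_shares(components)
    return sum(shares[level] for level in shared_levels)


# Hypothetical components, scaled to reproduce the 1998 M0 shares.
components = {"region": 0.093, "province": 0.030,
              "community": 0.265, "household": 0.612}
shares = variance_shares(components)
corr_same_community = same_unit_correlation(
    components, ["region", "province", "community"])
print({level: round(share, 3) for level, share in shares.items()})
print(round(corr_same_community, 3))
```

The second function reflects a general property of nested random intercept models: two households in the same community also share their province and region, so under the independence assumptions their residual correlation is the sum of the three higher-level shares, while the community share alone is the variance partition coefficient of that level.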

Clearly, most of the variance exists at the household level. It should be
emphasized, however, that household expenditure data in developing countries are
usually measured with error, given that it is generally very difficult to get
precise information on expenditures if simple recall questions are used. Our
model attributes all of the variance that is due to measurement error in the
expenditure data to the household level component. If we were able to account for
these errors, the contribution of the household level variance to the total
variance would probably be lower and, in consequence, the contribution of the
higher levels higher. The contribution of the variance on the provincial and
regional level is relatively low. We conclude, at this stage, that differences in
household incomes are mainly driven by household and community (or cluster)
characteristics and to a smaller extent by regional characteristics. The
contribution of the provincial level is very low. In fact, in Burkina Faso
regions rather than provinces follow agro-climatic zones, which can explain why
regions make a higher contribution than provinces.

As explained above, the finding of a significant contribution of higher level
characteristics to income does not necessarily have to be the result of
differences in the higher level characteristics themselves. For instance,
differences between communities might result from a systematic distribution of
household characteristics across communities, i.e. similar households are
spatially concentrated. To see whether this is the case, we have to test the
proportional change in the variance components, the random intercepts, after
accounting for household characteristics, i.e.

Table 2.5: Models - 2003 - Fixed effects


MO 
Household level 
HHsize 
Children Adult 
Youth Adult 
Elderly Adult 
Age 
Sex 
Literate Head 
Literate Adult 
Cotton 
Livestock 
Muslim 
Christian 
Mossi 


М1 


-0.054 
-0.227 
-0.190 


-0.004 
-0.060 
0.272 
0.268 
0.079 
0.034 


жжж 
жж 
ЖЖЖ 


жж 


LIII 
LIII 
LIII 
жж 


М2 M4 


-0.054  *** 
-0.219 
-0.178 


-0.060 
-0.206 
-0.183 


жжж 


LIII 


-0.004  ** 
-0.064 
0.232 
0.2538 
0.117 
0.083 


-0.004 
-0.074 
0.252 
0.212 
0.105 
0.096 


LLLI 
LLLI 
жжж 


жж 


Community level 
ZD Religion 

ZD Ethnicity 

ZD Cotton 

ZD Livestock 

ZD Literate Adult 
ZD Literate Head 
ZD Hhsize 

ZD Children Adult 
ZD Youth Adult 
ZD Elderly Adult 
Electricity 

ZD Urban 
Primary Access 
Secondary Access 
Healthcenter Access 
Market Access 
Provincial level 
Landsize 

Rain 

Pop. Density 
Tarred Road 

Size 


-0.347 
-0.260 * 
0.827 


-0.370 
-0.290 
0.706 


LLLI 


0.138 0.146 


0.088  ** 0.075 


Regional Level 
Ltempmax 
Constant 

AIC 

LR test 

Obs 


11.150 
19143 
0.000 
8488 


Жжжж 


11.780 
16305 


8488 


жжж 


PII 


12.080 11.870 
16132 15976 
- 0.000 
8488 8488 


хк 


Source: Authors’ calculations based on individual dataset 

Table 2.6: Intraclass correlation coefficients (ICC)


               1994                 1998                        2003
                 M0     M1     M2     M0     M1     M2     M3     M0     M1     M2
Region         9.6%   7.8%   3.5%   9.3%   5.0%   1.2%   1.4%  11.1%   6.2%   3.9%
Province       4.1%   2.1%   1.0%   3.0%   3.1%   2.1%   2.0%   3.3%   6.3%   7.8%
Community     21.9%  15.9%   7.4%  26.5%  18.9%   9.0%   9.3%  20.5%  15.1%   9.0%
Households    64.4%  74.2%  88.1%  61.2%  72.9%  87.7%  87.3%  65.2%  72.5%  79.3%


Source: Authors' calculations based on individual dataset


to control for systematic differences in household characteristics across higher 
levels. However, it should be noted, that household characteristics might lie in the 
causal pathway between area characteristics and household income, e.g. better 
and more schools may lead to better education outcomes. Including household 
characteristics will probably lead to an understatement of the importance of area 
characteristics. Hence, it is important to carefully discuss the household level 
variables and the potential influence of area characteristics on these variables. 


2.4.2 Model M1: The role of household characteristics 


In the second step, we add explanatory variables on the household level to the 
random intercept model. We call this model ‘M1’. The results are presented — 
for each year separately — in Tables 2.3 - 2.5. Since we use maximum likelihood 
techniques for estimation, we rely on the Akaike Information criterion (AIC) to 
select the best model. We estimated other versions of M1 with a much larger set 
of potentially important explanatory variables, but present here only those models 
with the lowest AIC. 
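
The selection rule can be written down in a few lines. The sketch uses the standard definition AIC = 2k - 2 ln L, where k is the number of estimated parameters; the candidate names, log-likelihoods and parameter counts are hypothetical and do not correspond to the models reported in Tables 2.3 - 2.5.

```python
def aic(loglik, n_params):
    """Akaike Information Criterion: lower values indicate a better trade-off."""
    return 2.0 * n_params - 2.0 * loglik


# Hypothetical candidate specifications: (log-likelihood, number of parameters).
candidates = {
    "M1_small": (-8520.0, 12),
    "M1_large": (-8515.0, 20),
}
scores = {name: aic(loglik, k) for name, (loglik, k) in candidates.items()}
best = min(scores, key=scores.get)
print(scores, best)
```

In this made-up example the larger specification fits slightly better but not by enough to offset the penalty for its eight additional parameters.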


Key household level characteristics 


All household variables have the expected sign and are in line with standard
regression results. In particular, household composition has a considerable
effect on income levels. In terms of per capita incomes, smaller households seem
to be significantly better off in all years under consideration. The dependency
ratios, measured via the children (0-6 years) per adult ratio, the youth (7-14
years) per adult ratio and the elderly (55 years and older) per adult ratio,
mostly have a significant effect. While young household members lower per capita
income in all years, the old-age dependency ratio is insignificant in 1994 and
2003 (and thus dropped from the regression for those years) and negative in the
drought year 1998, when food prices were extremely high.

Age of the household head has a significant negative effect on household income
in all years. The household head being a male adult does not seem to play a major
role concerning income, since its effect is only significantly positive in 2003.
The education of the household head is, as expected, very important in all years.
Households with a literate head and households with a higher percentage of
literate adults have on average a higher household income. Ethnicity has no
influence on household income, but religion does. Belonging to one of the two
large religious groups in Burkina Faso, Islam and Christianity, has a positive
but only barely significant effect on income.

The effect of cotton farming differs across periods. Cotton farmers were better
off in 1998 and 2003. In 1994 cotton did not yet have a significant effect. This
is plausible, since the ‘cotton boom’ set in after the devaluation of the CFA
Franc in January 1994, enhanced by a very favorable evolution of cotton prices
and accompanied by a substantial expansion of land used for cultivation. Farmers
who were also engaged in livestock herding, which is often done to diversify risk
and hence to lower the vulnerability to external shocks, were significantly
better off in 2003. However, a deeper analysis of this issue would require taking
into account the possible endogeneity, since richer farmers are more likely to be
engaged in livestock herding than poorer farmers. For the latter, the income
constraint does not allow them to buy any livestock. Obviously, it is now
interesting to see whether the effects of all these household characteristics
differ across communities.


Contribution of household characteristics to spatial variation 


For all years the community and regional variance components decline after the 
incorporation of household level covariates. For the provincial component the 
direction of the change is unstable for the different years which is not surprising 
given the small size and low significance of the provincial random intercept. The 
proportional changes of the community and regional variance components are 
surprisingly stable across survey years (see Table 2.7). Controlling for household 
level characteristics reduces the community variance component by around 50 
percent. 


Table 2.7: Proportional change of variance components


                1994            1998                    2003
                  M1      M2      M1      M2      M3      M1      M2
Region        -42.5%  -62.1%  -65.4%  -79.6%   15.5%  -62.4%  -42.3%
Province      -64.0%  -60.4%  -33.7%  -45.0%   -3.2%   29.0%   13.2%
Community     -49.1%  -61.1%  -54.6%  -60.5%    3.4%  -50.3%  -45.8%
Households    -19.1%   -0.2%  -23.9%   -0.2%    0.0%  -25.1%   -0.2%


Source: Authors’ calculations based on individual dataset

Abstracting from unobserved household characteristics, we would conclude that 50
percent of the community level variation in income levels is due to a systematic
distribution of household characteristics across communities while the rest is
due to community characteristics. Clearly, this is an unrealistic assumption and
would lead to an overestimation of the importance of area-specific effects. At
the same time, we have to consider that household characteristics are themselves
influenced by higher level factors. Levels of as well as returns to education,
cotton farming and livestock herding might be influenced by community
characteristics, which could be responsible for an underestimation of area
importance. Testing the explanatory power of community characteristics themselves
is therefore essential to draw conclusions on the contribution of community
differences to household income disparities.

On the regional level the effect of including household level variables was also
unambiguous. In 1998 and 2003 observed household characteristics can explain
about 60 percent of the total unexplained regional variance (40 percent in 1994).
Given that we controlled for household characteristics to the extent possible, we
conclude for regions as well that large scale variables have a non-negligible
impact on household level income.


2.4.3 Model M2: The role of community characteristics 


To test for the meaningfulness of our results which indicate a high importance of 
community characteristics, we will check the proportional change of the vari- 
ance components after the incorporation of community characteristics (Model 
M2). The remaining significant variation of the community level random intercept 
could be either due to unobserved household characteristics leaving the commu- 
nity random intercept more or less unchanged or due to community characteristics 
(observed or not) lowering the community random intercept towards zero. Again, 
we use the AIC as a model selection criterion and present only the best fits of the 
M2 model (see Tables 2.3 - 2.5). All community variables which were tested for 
significance are listed in Table 2.2. 


Key community level characteristics 


If the community matters, the question is of course which are the relevant
factors. Tables 2.3 - 2.5 reveal a distinct pattern across the three years. Urban
communities with a high ethnic fragmentation16, a high share of literate
household heads and adults, and access to electricity are better off, on average.
Besides the direct effect of having a literate adult in the household, there
seems to exist a contextual or spill-over effect of better educated on less
educated individuals within communities. However, access to primary and secondary
schools, as measured by the time needed to reach them, does not turn out to be
significant. Education is the only household characteristic which appears to have
some spill-over effects. Except for youth per adult in 1994, all other community
averages of household level characteristics turn out to be insignificant. This is
also true for communities with a higher share of cotton farmers, even though
cotton farmers themselves are better off in 1998 and 2003, and cotton is always
found to be a factor with some contextual effect in a community.


16 Ethnic fragmentation is measured as the variance of the shares of each
ethnicity in a community.
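
The fragmentation measure, the variance of the shares of each ethnicity in a community, can be sketched as follows. Several details are assumptions made for illustration and are not stated in the text: the population (rather than sample) variance is used, shares are computed over a fixed list of categories, and the group labels are placeholders.

```python
from collections import Counter
from statistics import pvariance


def ethnicity_share_variance(household_ethnicities, categories):
    """Population variance of the ethnicity shares within one community."""
    n = len(household_ethnicities)
    if n == 0:
        raise ValueError("community has no households")
    counts = Counter(household_ethnicities)
    shares = [counts.get(category, 0) / n for category in categories]
    return pvariance(shares)


categories = ["A", "B", "C"]          # placeholder group labels
mixed = ["A", "B", "C", "A", "B", "C"]
homogeneous = ["A"] * 6

v_mixed = ethnicity_share_variance(mixed, categories)
v_homogeneous = ethnicity_share_variance(homogeneous, categories)
print(v_mixed, round(v_homogeneous, 4))
```

Under this literal reading, perfectly equal shares give a variance of zero and a single dominant group gives the largest value, so the direction in which the variable is coded determines how the sign of its coefficient should be read; the text does not spell this out.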

Since we do not have a direct measure of electricity in a community, we coded a
community as having access to electricity if at least one household in that
community had access. Electricity might be a good proxy for infrastructure in a
community, such as access to roads, since power transmission lines are usually
found along (gravel) roads. Since at the community level we only have information
on electricity but not on other infrastructure such as roads, we cautiously
interpret the positive effect of electricity as a general positive effect of
community infrastructure on household income. However, access to schools, access
to health centers and access to markets turn out to be insignificant. The effect
of these kinds of public services might be, at least to some extent, captured by
the significant positive effect of urban communities, since all these services
are usually provided in urban areas.
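
The coding rule for the electricity indicator, a community counts as electrified if at least one of its households has access, is a simple aggregation from the household to the community level. A minimal sketch, with an invented function name and made-up records:

```python
def community_has_electricity(households):
    """Map each community id to True if any of its households has access.

    `households` is an iterable of (community_id, has_electricity) pairs.
    """
    access = {}
    for community_id, has_electricity in households:
        access[community_id] = access.get(community_id, False) or bool(has_electricity)
    return access


# Made-up household records: (community id, household has electricity).
records = [(1, False), (1, True), (1, False), (2, False), (2, False)]
electricity = community_has_electricity(records)
print(electricity)
```

Because a single connected household is enough, the indicator captures the presence of a power line in the community rather than how widespread access is.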

As mentioned in Section 3, in 1998 the household survey was accompanied by a
community survey for 325 out of the 425 clusters. This much larger community
level dataset in 1998 can, however, only be examined at the cost of losing a
fourth of all households in the sample. Hence, we report regression results using
data from the community survey separately in model M* in Table 2.4. Of all
community survey variables listed in Table 2.2, only access to a road, access to
a hospital and a high malaria incidence in a cluster significantly affect
household income. Signs are as expected. These results confirm the findings
derived from model M2. Beyond the positive effect of urbanicity, access to
markets and schools does not seem to play a major role in determining household
income. Access to roads, however, seems crucial in raising the potential for
income generation, as already suggested by the positive effect of electricity in
model M2, which we thought to be highly correlated with road access.


Contribution of community characteristics to spatial variation 


After accounting for community factors, the community variance component declines
significantly in all years (see Table 2.7). Around 60 percent of the unexplained
community level variation remaining in M1 could be explained by observed
community factors in 1994 and 1998. In 2003 it was still more than 40 percent.
Although we only have a modest database on community level variables, this small
set of variables is capable of explaining a significant part of the observed
between-community differences. Hence, in addition to simply specifying some
significant relationship between contextual variables and household income as
done above, we conclude that these variables are actually responsible for a large
part of the community level disparity.17 Community endowments have a significant
effect on household income.

The variance partitioning even allows us to quantify this contribution to total
income variation. The very limited set of neighborhood characteristics accounts
for approximately 7%¹⁸ of household income variation. Since we neglect
any measurement error as well as any effects of the community on household
characteristics, this result can be seen as a lower bound of the contribution of
community characteristics to total income variation.
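
The arithmetic behind this figure can be retraced in a few lines. The following is a minimal sketch in Python, using the 1998 values given in the footnote to this paragraph. The function and argument names are illustrative assumptions and are not part of the original analysis.

```python
def community_contribution(icc_m0, pcv_m1, pcv_m2):
    """Share of total income variance attributed to observed community variables.

    icc_m0: share of total variance located at the community level in model M0
    pcv_m1: proportional reduction of the community variance from M0 to M1
    pcv_m2: proportional reduction of the remaining community variance from M1 to M2
    """
    # Community share, times the part left after household variables,
    # times the part of that remainder explained by community variables.
    return icc_m0 * (1.0 - pcv_m1) * pcv_m2


# 1998 values as reported in the footnote
share_1998 = community_contribution(icc_m0=0.265, pcv_m1=0.546, pcv_m2=0.605)
print(f"{share_1998:.1%}")  # prints 7.3%
```

Because the product only uses variance shares, the same function can be applied to the other survey years once the corresponding shares are read off the variance tables.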

However, the question remains whether the provincial and regional income
disparities that persisted after controlling for household characteristics are
actually driven by differences in provincial and regional endowments, or whether
they are mainly driven by differences in community characteristics between these
areas. Table 2.7 shows that around 60 percent of the remaining regional level
variation in 1994, 80 percent in 1998 and 40 percent in 2003 can be explained
by differences in observed community endowments. After household and community
level determinants are taken into account, less than 5 percent in 1994 and
1998 and less than 12 percent in 2003 of the remaining total unexplained
variation is situated at the provincial and regional levels together. Here again,
it should be noted that lower level factors are likely to be driven by macro
factors and, hence, we risk understating the influence of variables at higher
aggregation levels. Moreover, likelihood ratio tests show that both levels still
have a significant impact.


2.4.4 Model M3: The role of provincial and regional characteristics


In model M3 we incorporate provincial and regional level variables. However,
except for the 1998 rainfall variable (the drought year), all provincial and regional
variables turned out to be insignificant. Population density, the density of tarred
and gravel roads, the average maximum temperature or the variation of rainfall did


¹⁷ The remaining unexplained community variation cannot be explained with the data
at hand.

¹⁸ For instance, for the year 1998: ICC(M0) × (1 − proportional change of variance
component(M1)) × proportional change of variance component(M2) = 0.265 × (1 − 0.546)
× 0.605 = 7.3 percent.


not show a significant effect once household and community level characteristics
were included. The remaining unexplained variation could not be lowered in any
of the three years under consideration. Table 2.8 summarizes the contribution of
observed and unobserved characteristics to the total variance and to the variance
at each spatial level.


Table 2.8: Contribution of observed and unobserved characteristics on the
variation on each level


                                     1994      1998      2003

Household level                     64.4%     61.2%     65.2%
  Household variables               19.1%     23.9%     25.1%
  Unobserved                        80.9%     76.1%     74.9%
Community level                     21.9%     26.5%     20.5%
  Household variables               49.1%     54.6%     50.3%
  Community variables               31.1%     27.5%     22.7%
  Unobserved                        19.8%     17.9%     27.0%
Provincial level                     4.1%      3.0%      3.3%
  Household variables               64.0%     33.7%    <10^-3
  Community variables               21.7%     29.8%    <10^-3
  Provincial/Regional variables    <10^-3      1.2%    <10^-3
  Unobserved                        14.3%     35.3%      >99%
Regional level                       9.6%      9.3%     11.1%
  Household variables               42.5%     65.4%     62.4%
  Community variables               35.7%     27.5%     15.9%
  Provincial/Regional variables    <10^-3    <10^-3    <10^-3
  Unobserved                        21.8%      7.1%     21.7%


Source: Authors’ calculations based on individual dataset 


The finding of insignificant macro-level variables might seem surprising, but it
is in fact quite consistent with other findings in the literature. Jalan and Ravallion
(2002) and Benson et al. (2005) likewise find no significant effect of population
density on household income. Benson et al. (2005) even confirm our result of no
significant effect of access to roads, which according to Jacoby (2000) results
from a low infrastructure elasticity of poverty.

Burkinabe households seem to have adapted their income generation process
to the inherent climatic disadvantages in such a way that the amount and the
variation of rainfall in ‘normal times’ do not have a significant impact on their
income. However, the occurrence of substantial climatic shocks, such as a drought
or an abnormal distribution of rainfall over the year, does play an important role,
as revealed by the significant positive rainfall coefficient in the drought year 1998.
Consistent with this result, Benson et al. (2005) find the effect of the amount of
rainfall on income in Malawi to be significant only when it is exceptionally high.
Similarly, Dercon (2004) finds significant effects for Ethiopia only when looking
at severe droughts.

Our results are also in line with those by Bigman et al. (2000) who conclude 
that regional inequality in Burkina Faso is driven by agro-climatic conditions, and 
disparities between villages are driven by differences in infrastructure. However, 
compared to Bigman et al. (2000), we stress the importance of community char- 
acteristics even more. Our analysis suggests that a large part of regional disparity 
is actually driven by differences in community characteristics between these re- 
gions. Hence, we think the actual impact of agro-climatic conditions is lower than 
suggested by Bigman et al. (2000). 


2.4.5 Model M4: Variations in household level effects across communities


In a next step, we allow household level variables to differ in their impact across
communities. Thus, in addition to random intercepts, we now also add random
coefficients (see equation 2.7) at the community level. The covariance structure of
the random effects is left unstructured, i.e. all variances and covariances are
estimated separately. We use an iterative procedure to test for significant variances
and covariances of all significant household level variables included in model M2.
We use likelihood-ratio tests, estimating the deviance of the model without the
specific random effect and of the model with it. We keep a random effect in model
M4 whenever the test statistic, the difference between the deviances of the two
models, is significant, i.e. if the p-value of the χ² test is below 5%
(Goldstein, 2003). In addition, variances and covariances are regarded as
insignificant when their standard error is larger than their estimate (Tseloni, 2006).
All estimates and their standard errors for model M4 are shown in Tables 2.9 and 2.10.
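
The deviance comparison described above can be illustrated with a small sketch. It is written in Python with the standard library only and rests on several assumptions: the two deviances are hypothetical numbers, the function names are illustrative, and the plain χ² reference distribution is used. Since a variance is tested on the boundary of its parameter space, the plain χ² p-value is commonly regarded as conservative. This is not the estimation code used for the reported results.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df >= 1."""
    if x <= 0:
        return 1.0
    y = x / 2.0
    # Start from the closed form for df = 1 (a = 1/2) or df = 2 (a = 1) and apply
    # the recurrence Q(a + 1, y) = Q(a, y) + y**a * exp(-y) / Gamma(a + 1).
    if df % 2 == 1:
        a, q = 0.5, math.erfc(math.sqrt(y))
    else:
        a, q = 1.0, math.exp(-y)
    while a < df / 2.0:
        q += y ** a * math.exp(-y) / math.gamma(a + 1.0)
        a += 1.0
    return q


def lr_test(deviance_without, deviance_with, extra_params, alpha=0.05):
    """Likelihood-ratio test for adding a random effect to a nested model."""
    stat = deviance_without - deviance_with   # difference of the two deviances
    p_value = chi2_sf(stat, extra_params)
    return stat, p_value, p_value < alpha


# Hypothetical deviances: adding one random slope adds its variance and its
# covariance with the random intercept, i.e. two extra parameters.
stat, p_value, keep = lr_test(24310.4, 24295.1, extra_params=2)
print(f"LR statistic = {stat:.1f}, keep random effect: {keep}")
```

With an unstructured covariance matrix, each further random coefficient adds one variance plus one covariance with every random effect already in the model, which is why `extra_params` grows with each step of the iterative procedure.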
Table 2.9: Models - 1994 - Random effects


                          M0               M1               M2               M4
                           Est.   Std.Err.  Est.   Std.Err.  Est.   Std.Err.  Est.   Std.Err.
Variances
var(region)                0.075  0.047     0.043  0.023     0.016  0.008     0.015  0.008
var(province)              0.032  0.018     0.012  0.007     0.005  0.003     0.006  0.003
var(community)             0.171  0.014     0.087  0.008     0.034  0.004     0.091  0.012
var(household)             0.502  0.008     0.406  0.006     0.406  0.006     0.360  0.006
var(hhsize)                                                                   0.000  0.000
var(youth adult)                                                              0.018  0.006
var(liter. head)                                                              0.055  0.013
Covariances
cov(hhsize, youth adult)                                                      0.000  0.001
cov(hhsize, lit. head)                                                       -0.002  0.001
cov(youth ad, lit. head)                                                     -0.007  0.007
cov(hhsize, cons)                                                            -0.005  0.001
cov(youth ad, cons)                                                          -0.018  0.007
cov(lit. head, cons)                                                          0.009  0.009

Table 2.10: Models - 2003 - Random effects


                          M0               M1               M2               M4
                           Est.   Std.Err.  Est.   Std.Err.  Est.   Std.Err.  Est.   Std.Err.
Variances
var(region)                0.085  0.045     0.032  0.021     0.019  0.013     0.019  0.0137
var(province)              0.025  0.013     0.032  0.012     0.037  0.011     0.041  0.0122
var(community)             0.157  0.013     0.078  0.007     0.042  0.005     0.066  0.0114
var(household)             0.502  0.008     0.376  0.006     0.375  0.006     0.350  0.0059
var(hhsize)                                                                   0.001  0.000
var(youth adult)                                                              0.005  0.004
var(liter. head)                                                              0.074  0.014
Covariances
cov(hhsize, youth adult)                                                      0.002  0.0007
cov(hhsize, lit. head)                                                       -0.003  0.0011
cov(youth ad, lit. head)                                                     -0.012  0.0069
cov(hhsize, cons)                                                            -0.004  0.0011
cov(youth ad, cons)                                                          -0.013  0.0061
cov(liter. head, cons)                                                        0.016  0.0094


Source: Authors’ calculations based on individual dataset


Spatially varying household effects


The results of our analysis are, once again, relatively homogeneous across time.
Indeed, we find that the returns associated with education and household size and
the effects related to dependency ratios (children per adult and youth per adult)
vary significantly across communities in all three years. On the other hand, the
returns associated with the age and gender of the household head, with cotton
farming and with livestock herding do not vary significantly across communities
in any year.

The variation of returns across communities is not only statistically but also
economically meaningful. The fixed effect estimate of the variable ‘literate head’
of 0.26 in 1994 implies that households with a literate head have on average a per
capita income that is 26 percent higher than that of households with an
illiterate head. The variance of the random effect of the literate head variable
shows, however, that this return differs significantly between communities. For
1994, for instance, the effect varies from minus 21 percent ((0.26 − 2 × √0.055) ×
100) to plus 73 percent ((0.26 + 2 × √0.055) × 100) between the 2.5th and 97.5th
quantiles of Burkinabe communities. Similar variations are found for 1998 and
2003. The effects associated with changes in household composition also vary
substantially across communities.
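
The range quoted for 1994 follows directly from the fixed effect and the variance of the random coefficient. A minimal Python sketch of this calculation is given below. The function name is an illustrative assumption, and the factor of 2 standard deviations is the approximation used in the text for the 2.5th and 97.5th quantiles.

```python
import math


def return_range(fixed_effect, slope_variance, width=2.0):
    """Approximate 2.5th-97.5th quantile range of a community-specific return, in percent."""
    sd = math.sqrt(slope_variance)  # standard deviation of the random coefficient
    lower = (fixed_effect - width * sd) * 100
    upper = (fixed_effect + width * sd) * 100
    return lower, upper


# 1994: fixed effect of 'literate head' = 0.26, variance of its random effect = 0.055
low, high = return_range(0.26, 0.055)
print(f"{low:.0f} to {high:.0f} percent")  # prints -21 to 73 percent
```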


Determinants of spatially varying effects 


We conclude that the community influences the effects associated with household
characteristics, in particular with education. From a policy point of view, it
is important to know what drives these community effects. In the case of returns
to education, the influence might be channeled through unobserved factors such as
labor market characteristics or access to modern (agricultural) production
technologies. These factors are more likely to be found in better developed
communities. However, higher returns to education could also be the result,
assuming decreasing marginal returns to education, of higher marginal effects in
some poor and remote communities. While the former case would tend to lead to
income divergence across communities, the latter could lead to income convergence.

To gain further insights, we can calculate the best linear unbiased predictors
(BLUP) of the random effects and check whether variations in returns across
communities follow a distinct pattern across the 13 agro-climatic regions of
Burkina Faso.¹⁹ We cannot, however, find any evidence of a North-South or
East-West pattern in returns to education across the 13 regions in any year.
The same is true for the


¹⁹ Since the regression coefficients associated with the household characteristics
(which are random variables at the community level) are directly determined by the
observed community level factors (see equation 2.3), further regression analysis is
not feasible.


household size and dependency ratios.²⁰ We conclude that returns to these factors
are driven by small-scale community characteristics rather than by any regional
factor.

We can also examine the covariance of random effects and random intercepts.
For the returns to education, the covariance between the random effect and the
random community intercept turns out to be insignificant in 1994 and positive
and significant in 1998 and 2003. On average, returns to education are higher in
richer communities, ceteris paribus. Again, this may point to the impact of
unobserved community factors on educational returns. As stated above, labor markets
are usually better developed in richer communities in the sense that they offer more
opportunities for a better educated and trained work force. Moreover, modern
agricultural inputs, which may require skilled labor, are more likely to be found
in richer communities. Hence, there is little evidence for higher returns to
education in poorer communities. This is probably due to an only weakly competitive
labor market and the generally low demand for skilled labor in rural areas of poor
countries such as Burkina Faso. Therefore, we conclude that disparities in returns
to education cannot explain convergence across districts.

Regarding the effect of household size, the covariance with the community
intercept is significantly negative in all years. The same is true for the effects
associated with the dependency ratios (children and youth per adult). This is an
interesting result: an additional household member, whether of working age or
not, lowers per capita income more in richer than in poorer communities. In the
Burkinabe context, it might simply show that it is easier for an agricultural than
for an urban household to feed and sustain an additional household member.²¹
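
To get a feeling for the strength of these associations, the covariances can be rescaled into correlations. The sketch below, in Python, does this for two of the 2003 estimates of model M4 in Table 2.10. It assumes that var(community) in model M4 is the variance of the random community intercept ("cons"). The inputs are the rounded table values, so the resulting correlations are only approximate, and the function name is illustrative.

```python
import math


def correlation(cov, var_a, var_b):
    """Correlation implied by a covariance and the two corresponding variances."""
    return cov / math.sqrt(var_a * var_b)


# Model M4, 2003 (rounded estimates from Table 2.10)
var_intercept = 0.066  # var(community), assumed to be the random intercept variance

r_literate = correlation(cov=0.016, var_a=0.074, var_b=var_intercept)
r_youth = correlation(cov=-0.013, var_a=0.005, var_b=var_intercept)

print(f"literate head vs. intercept:   {r_literate:.2f}")  # prints 0.23
print(f"youth per adult vs. intercept: {r_youth:.2f}")      # prints -0.72
```

The household size coefficient is left out here because its variance is reported as 0.001 after rounding, which makes the implied correlation too imprecise to be informative.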


2.5 Conclusion 


The objective of this paper was to analyze the sources of spatial disparities in
income among households in Burkina Faso. We find that about 60 percent of the
total variance in incomes stems from variance between households, 20 percent
from variance between communities, less than 5 percent from variance between
provinces and about 10 percent from variance between (agro-climatic)
regions. At each level, community characteristics play a very important role.
In particular, our findings suggest that communities and provinces are poor not
only because the households living there are poor but also because the
endowments of these communities are very weak (and vice versa for rich
communities). Differences in observed community characteristics also account for
a large part of the regional variation. Hence, community characteristics matter.


²⁰ Results can be found in the appendix (Figures A.8-A.16). Codes are defined in Table A.9.

²¹ Our finding of a negative covariance between intercepts and household size and
dependency ratios also holds when Ouagadougou is dropped from the regression.


We also find that the effects associated with households’ education, size and
composition are community-specific. For instance, we find higher returns to
education in richer communities. In contrast, returns to cotton
farming and livestock herding are more or less constant across these spatial units.

One may be tempted to conclude from our analysis that poverty alleviation policies
should intervene at the community level, since that is where we identify the most
important source of variance, and that interventions at the regional or national
level would therefore risk wasting resources. However, political and institutional
constraints might make it difficult to intervene at that level. This has to be
studied case by case.

Finally, it should be noted that our analysis is constrained by the limited
availability and the modest quality of data at the different spatial levels. In
Burkina Faso, as in many other developing countries, community surveys are
missing. Geo-referenced data are also often unavailable. However, as we show,
small-scale area data are key to understanding and tackling spatial disparities
in income.




Essay 3 


Low Malnutrition but High 
Mortality: Explaining the Paradox 
of the Lake Victoria Region 


Abstract: Exploiting DHS data from 235 regions in 29 Sub-Saharan African countries,
we find that the combination of low levels of malnutrition with dramatically high
rates of mortality encountered in Kenya’s Lake Victoria territory is unique in
Sub-Saharan Africa. This paper explores the causes of this paradox in
the Kenyan context. Our identification strategy consists of two parts. First,
we apply multilevel regression models to control simultaneously for family and
community clustering of the observed malnutrition and mortality outcomes.
Second, to address unobserved but correlated factors, we exploit information from
GIS and malaria databases to construct variables that capture additional
components of children’s geographic, political and cultural environment. Our
analysis reveals that beneficial agricultural conditions and feeding practices lead
to the observed sound anthropometric outcomes around Lake Victoria. In contrast,
high mortality rates rest upon an adverse disease environment (malaria prevalence,
water pollution, HIV rates) and policy neglect (underprovision of health care
services). Nonetheless, a significant effect of the local ethnic group, the Luo, on
mortality remains.


based on joint work with Jan Priebe. 


3.1 Introduction 


There are a small number of regions in Sub-Saharan Africa that show very high
levels of mortality given the anthropometric status of their population. Among
these regions Nyanza, the principal Kenyan Lake Victoria province, is an
exception in its own right. In no other region in Sub-Saharan Africa (SSA) is the
pattern of low levels of malnutrition together with dramatically high rates of
mortality as pronounced as in Nyanza. Furthermore, the unique position of Nyanza is
puzzling not only in the Sub-Saharan African context but also in the Kenya-specific
context. While Nyanza lies at the upper end of the mortality scale, most other
Kenyan provinces show comparatively low mortality rates given their levels of
malnutrition.

In this paper we investigate the role of cultural, geographic, and political
factors in the relationship between anthropometric outcomes of children and
under-5 mortality rates in Kenya, with an explicit focus on the unique situation of
Nyanza and the territory around Lake Victoria. In order to disentangle the
underlying mechanisms that lead to the observed outcomes, we analyze the factors
driving mortality, stunting, and wasting jointly. Since parameter estimation can
suffer seriously from endogeneity problems, we adopt three strategies to mitigate
this problem. First, we estimate reduced form regressions and therefore exclude any
explanatory variable that we would expect to cause problems of simultaneous
causality. Second, we augment our DHS data by generating appropriate variables on
malaria, health provision and Lake Victoria in order to mitigate problems arising
from potential omitted variable bias. Third, we use mixed model representations
to further address unobserved heterogeneity at the family and community
level.

Our findings point to a unique interaction of cultural, geographic and political
factors in the Lake Victoria region that is responsible for the described
paradox. In particular, high mortality rates are found to rest upon the disease
environment in the territory in combination with unfavorable cultural habits of the
local ethnic group with respect to sexual and pre- and postnatal behavior. Political
discrimination against this group, resulting in reduced access to health
infrastructure, further exacerbates the mortality situation in the region.
Nonetheless, even after controlling for other factors, a significant
ethnicity-specific influence on mortality remains, although the effect is much
smaller than found in previous studies. On the other hand, the area around Lake
Victoria displays extraordinarily favorable conditions (fertile soils, a high level
of food security and high protein availability from fish) that contribute to
children’s advantageous nutritional outcomes.

In this regard, the present study adds an important new example and further
insights to the few existing mortality-malnutrition paradoxes. For instance, the
famous and widely investigated South Asia vs. Sub-Saharan Africa enigma
(Ramalingaswami et al., 1996; Svedberg, 2000; Harttgen and Misselhorn, 2006; Klasen,
2008) refers to the observation that anthropometric outcomes of children are on
average much better in SSA than in South Asia, while child mortality rates are
significantly higher in SSA than in South Asia.¹ Our study
contributes to this literature in several ways. First, we are the first
to explicitly state the Lake Victoria paradox and to analyze it comprehensively.
Second, the study illustrates that such a paradox can arise not only in the context
of potentially large genetic differences (Klasen, 2008), but that an interaction
of cultural, geographic and political factors can also reverse the positive
relationship between a good nutritional status and the survival chances of children.
Third, in contrast to previous empirical studies on the South Asia vs. SSA enigma,
we explicitly control for factors related to the disease environment, e.g. HIV and
malaria, and for cultural factors, and are therefore in a better position to obtain
unbiased coefficient estimates. Fourth, we use recent advances in multilevel
modeling techniques that allow for the estimation of 3-level models, which
enables us to separate effects working at the individual, household, and community
level. Moreover, this study is to our knowledge the most complete and accurate
one analyzing the current determinants of under-5 mortality and anthropometric
outcomes in Kenya.


Examining the nature of the Nyanza anomalies is also interesting and important
because this example seems to call into question main findings in a variety of
academic disciplines, most notably epidemiology, health economics, labor economics
and economic history. In epidemiology, malnutrition, in particular wasting (low
weight-for-height), is considered to be the main driver of mortality in developing
countries (Villamor et al., 2005; Fawzi et al., 1997) and historical Europe
(Fogel, 1994). Pelletier et al. (1995) claim that wasting, which is positively
related to mortality due to diarrhea, fever and breathlessness, is the underlying
cause of more than 50% of all child deaths in the world. Moreover, Caulfield et al.
(2004) find that a sound nutritional status of children lowers the likelihood of
dying from malaria, while Villamor et al. (2005) show that among HIV-infected
children the risk of dying at an early age is significantly higher for wasted
children. Thus, children who suffer from malnutrition face a significantly higher
risk of mortality due to a lower resistance to illnesses. This individual
relationship is usually assumed to hold at a higher aggregate level as well. Thus,
areas with a high prevalence of malnutrition are expected to have higher mortality
rates. For the Kenyan context we find the exact opposite pattern. While rates


¹ In another mortality-malnutrition paradox, Williamson (1990) finds that during the
British industrial revolution the population in urban areas suffered from much
higher mortality rates than the rural population despite their much better
anthropometric status.


of wasted children gradually decrease with proximity to Lake Victoria, mortality
rates steadily increase, reaching their peak in Nyanza.


In health and labor economics, interest in studying the growth process of
children has more recently been motivated by the finding that taller populations
are economically better off, more productive, and live longer (Bozzoli et al., 2009;
Deaton, 2008). This result can partly be attributed to the prevailing disease
environment. If the child growth process, and therefore adult height, on the one
hand and life expectancy on the other are both positively correlated with the
absence of certain diseases, then, for instance, a simple Beckerian quantity-quality
trade-off model can explain the higher incomes of taller populations through the
human capital formation process. The relationship between adult height and economic
well-being might therefore be valid only if adult height is at the same time a good
proxy for the mortality environment of a certain area or country. In the absence
of this latter condition the relationship might be seriously flawed, and this is
what we partly observe in the Kenyan context.


Furthermore, there exist several studies in the field of economic history that
make inferences about economic conditions during a specific time period based
on mean height measures of populations or population subgroups. Since height is
expected to be an increasing but concave function of income, average height will
be negatively correlated with initial income inequality (Steckel, 1995). Hence,
income inequality has an effect on the dispersion of heights, so that inequality in
height might function as an indicator of income inequality in the absence of data
on the latter, while mean height might serve as an indicator of mean income
(Deaton, 2008). Obviously, this inference holds only if there are no third
factors that alter the relationship in an important way. Although this point has
been recognized in the relevant literature (Deaton, 2008), it is often neglected or
downplayed due to a lack of data that can serve as control variables. Again, the
case of Nyanza and Lake Victoria illustrates that such an inference can simply go
wrong.
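
The link between concavity and inequality in this argument can be made concrete with a stylized numerical example. The Python sketch below assumes, purely for illustration, that height depends on income through the natural logarithm. Both the functional form and the income figures are hypothetical and are not taken from the cited studies. Two populations with the same mean income are compared.

```python
import math


def mean_height(incomes, height_of_income=math.log):
    """Average 'height' when height is an increasing, concave function of income."""
    return sum(height_of_income(y) for y in incomes) / len(incomes)


# Two hypothetical populations with identical mean income (100)
equal = [100.0, 100.0, 100.0, 100.0]
unequal = [40.0, 60.0, 140.0, 160.0]

print(mean_height(equal) > mean_height(unequal))  # prints True
```

Under these assumptions the more unequal population has the lower average height although mean income is identical, which is the mechanism behind the negative correlation described above.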


The paper is structured as follows: Section 3.2 describes the extent of the
paradox of Nyanza and Lake Victoria in the Kenyan and Sub-Saharan
African setting with respect to stunting, wasting and under-5 mortality. Section
3.3 discusses the theoretical model and outlines the identification strategy. Section
3.4 provides a detailed literature review of the most relevant cultural,
geographical, and political particularities of the Kenyan context in order to
explain the construction and interpretation of additional variables not included in
DHS surveys. Section 3.5 provides descriptive statistics on the data sets and
variables used in this study and comprises the multivariate analysis. Section 3.6
summarizes and concludes.


3.2 The Paradox


Evidence from the health and demographic literature suggests that there exists
a clear positive relationship between a good nutritional status of a child and its
chances of survival. From this observation it is usually inferred that this
individual relationship also holds at a higher aggregate level. Hence, regions or
countries that perform well in terms of anthropometric indicators should exhibit
lower mortality rates and vice versa.


However, the validity of this inference seems to be seriously challenged by the
spatial distribution of malnutrition and mortality in Kenya. Comparing the
anthropometric and mortality outcomes of children in Nyanza province with those
in the other Kenyan provinces, two things stand out. First, children in
Nyanza score very well on anthropometric indicators, while infant and
child mortality rates are extremely high in the region. Second, the sheer extent of
the within-country variation in mortality rates is astonishing. The extraordinary
situation of Nyanza, or more precisely of the Lake Victoria region, in the Kenyan
context has long been noted. An investigation of 16 villages in the area north of
Lake Victoria in 1922 showed infant mortality rates between 335 and
514 per 1,000 live births, with infant mortality tending to decline with
distance from the lake (Colony and Protectorate of Kenya, 1923). Meanwhile,
historical data on mean adult height for the same period suggest that the ethnic
group living on the shores of Lake Victoria was the tallest in all of Kenya (Moradi,
2009).


Interestingly, although some authors have occasionally mentioned either the
high mortality rates or the favorable anthropometric outcomes on the shores of
Lake Victoria, to our knowledge no study exists that combines these two findings.
Therefore, the puzzling situation on the shores of Lake Victoria has not yet been
stated as such in the literature. Moreover, it has not been clear whether the
observed mortality pattern combined with favorable anthropometric outcomes on the
shores of Lake Victoria is unusual even in a larger geographical context. Hence,
in order to assess the peculiarity of the Kenyan Lake Victoria region in the
Sub-Saharan African setting, we compiled a data set containing information on
anthropometric indicators (prevalence of stunting and wasting) and mortality
indicators (under-5 mortality rate) at the regional level for all SSA countries for
which appropriate Demographic and Health Survey (DHS) data were available. If more
than one DHS round was available for a country, we took only the latest round into
consideration. Furthermore, since the child growth reference standard used to
calculate z-scores changed rather recently, anthropometric statistics taken from
the official DHS reports would be based on two different standards. For the sake of
comparability, we recalculated stunting and wasting prevalence rates
using the new WHO child growth reference standard for all regions (WHO, 2006).
The final data set comprises 235 regions in 29 SSA countries.
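
Once z-scores have been computed against a common reference standard, prevalence rates are simple shares. The Python sketch below shows this last step only. It assumes the conventional cutoff of −2 standard deviations (height-for-age for stunting, weight-for-height for wasting), which is not spelled out in the text. The z-scores are hypothetical, sample weights are ignored, and the derivation of the z-scores from the WHO reference tables is not shown.

```python
def prevalence(z_scores, cutoff=-2.0):
    """Share of children whose z-score lies below the cutoff; missing values are skipped."""
    valid = [z for z in z_scores if z is not None]
    if not valid:
        raise ValueError("no valid z-scores")
    return sum(z < cutoff for z in valid) / len(valid)


# Hypothetical z-scores for one region (None marks a missing measurement)
height_for_age = [-2.6, -1.1, 0.3, -2.1, None, -0.8, -3.0, -1.9]
weight_for_height = [-0.4, -2.3, 0.9, -1.2, 0.1, None, -0.7, -1.5]

stunting = prevalence(height_for_age)    # 3 of 7 valid observations
wasting = prevalence(weight_for_height)  # 1 of 7 valid observations
print(f"stunting: {stunting:.1%}, wasting: {wasting:.1%}")
```

Applying the same cutoff and the same reference standard to every region is what makes the resulting rates comparable across survey rounds.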

Table 3.1 depicts under-5 mortality, stunting, and wasting rates for all regions 
with an under-5 mortality rate of 200 per 1,000 live births or more. Of these 36 re- 
gions only Tambacounda province in Senegal shows better stunting rates than the 
Nyanza province while with respect to wasting only Tete and Niassa provinces in 
Mozambique show lower prevalence rates. Moreover, it is interesting to note that 
all regions in the table that achieve prevalence rates in one of the anthropometric 
indicators similar to those in Nyanza, score much worse than Nyanza in the other 
anthropometric indicator. Therefore, Nyanza seems to be the only region in SSA 
that scores extremely well in stunting and wasting given the level of mortality. 

The result is even more striking when focusing on the area in close proximity
to Lake Victoria. Using the 2003 DHS round for Kenya together with the GIS data
provided with it, we calculated under-5 mortality rates and the two anthropometric
indicators for all children born within 20 km of Lake Victoria. In this
area, under-5 mortality rates are approximately 50% higher than the Nyanza average,
at 306 per 1,000 live births, while stunting rates even fall to
26.6% and wasting rates increase slightly to 3.9%. Furthermore, the extreme
position of the Lake Victoria region and Nyanza is illustrated in
Figure 3.1, which depicts the bivariate relationship between stunting and under-5
mortality for all 235 regions and the Lake region as defined above. Figure 3.1
underscores the unusually high mortality level of Nyanza given its stunting rates
and, more importantly, the unique position of the Lake area in the SSA context,
with an overwhelmingly high under-5 mortality rate given the level of stunting.
A similar conclusion can be derived from Figure 3.2, which presents the bivariate
relationship between wasting and under-5 mortality.
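
One way to select survey clusters within 20 km of the lake is to compute great-circle distances between cluster coordinates and points along the shoreline. The Python sketch below illustrates this idea under stated assumptions: the shoreline points and cluster coordinates are hypothetical, the haversine formula with a mean Earth radius is used, and nothing here is meant to reproduce the exact GIS procedure behind the reported figures.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def within_lake_buffer(cluster, shoreline, max_km=20.0):
    """True if the cluster lies within max_km of any sampled shoreline point."""
    lat, lon = cluster
    return any(haversine_km(lat, lon, s_lat, s_lon) <= max_km for s_lat, s_lon in shoreline)


# Hypothetical shoreline sample points and survey clusters as (latitude, longitude)
shoreline = [(-0.10, 34.75), (-0.45, 34.30), (-0.90, 34.10)]
clusters = {"A": (-0.15, 34.80), "B": (-0.50, 35.60)}

flags = {name: within_lake_buffer(pos, shoreline) for name, pos in clusters.items()}
print(flags)  # prints {'A': True, 'B': False}
```

The accuracy of such a buffer depends on how densely the shoreline is sampled, so in practice a polygon of the lake and a proper GIS distance function would be the more robust choice.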

Besides highlighting the extraordinary situation of the Lake Victoria region,
all 3 figures show further puzzling results. In particular, the within-country
distribution of under-5 mortality rates is remarkable. While Nyanza is situated
at the far upper end of under-5 mortality rates given its level of stunting or
wasting, several Kenyan provinces find themselves on the opposite side, showing
very low mortality rates given their levels of the respective anthropometric
indicators. Such a divergence of mortality levels within one country is highly
unusual even in the SSA context. Column 9 in Table 3.1 presents the coefficient
of variation (CV), calculated separately for each country. Kenya shows the
highest CV among all 29 countries in the sample, indicating the highest level
of dispersion relative to its mean level of mortality.
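
As a minimal sketch of the two dispersion measures used here, the following
Python snippet computes a coefficient of variation and a range for one country.
The provincial rates are hypothetical; only the two endpoints are chosen so
that the range reproduces the -152 reported for Kenya in Table 3.1. Whether the
original calculation used the population or the sample standard deviation, and
whether it weighted provinces, is not stated, so the choices below are
assumptions.

```python
from statistics import mean, pstdev


def dispersion(rates):
    """Return (coefficient of variation, min-max range) for a list of rates."""
    avg = mean(rates)
    cv = pstdev(rates) / avg  # assumption: population SD, unweighted provinces
    value_range = min(rates) - max(rates)  # negative, as in the Range column
    return cv, value_range


# Hypothetical provincial under-5 mortality rates (per 1,000 live births)
example_rates = [54, 78, 91, 105, 116, 149, 163, 206]

cv, value_range = dispersion(example_rates)
print(f"CV = {cv:.2f}, Range = {value_range}")
```

A higher CV means that provincial rates are more spread out relative to the
national mean, which is the sense in which Kenya stands out in the table.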

Table 3.1: Mortality and undernutrition rates in the SSA context

Country        Region         Year   Mortal.   Stunt.   Wast.   Range   CV
Senegal        Tambacounda    2005   200       28.7     11.2    -126    0.31
Zambia         Western        2001   201       49.7     3.3     -118    0.22
Burkina Faso   Sud-Ouest      2003   203       47.7     25.7    -166    0.22
Chad           Zone8          2004   204       35.2     13.9    -122    0.20
Cameroon       Nord           2004   205       49.6     8.4     -130    0.27
Senegal        Kolda          2005   205       39.6     9.3     -126    0.31
Mozambique     Sofala         2003   205       48.6     7.4     -152    0.28
Kenya          Nyanza         2003   206       33.4     3       -152    0.40
Mozambique     Tete           2003   206       54.8     2.5     -152    0.28
Mozambique     Niassa         2003   206       50.5     2.4     -152    0.28
Guinea         Kankan         2005   207       46.1     15.6    -126    0.21
Ghana          Upper West     2003   208       36.3     15.4    -133    0.34
Guinea         Kindia         2005   211       39.2     82      -126    0.21
Burkina Faso   Cascades       2003   211       45.8     30      -166    0.22
Burkina Faso   Centre-Ouest   2003   213       43.3     20.2    -166    0.22
Niger          Tahoua         2006   214       51.3     12.9    -158    0.29
Niger          Dosso          2006   215       46.7     11.5    -158    0.29
Guinea         N'Zérékoré     2005   218       45       11.9    -126    0.21
Mozambique     Nampula        2003   220       47       9.6     -152    0.28
Malawi         Mulanje        2004   221       53.1     8       -109    0.18
Mali           Koulikoro      2006   222       39.1     16.2    -154    0.25
Mali           Mopti          2006   227       40.9     12.7    -154    0.25
Mali           Tombouctou     2006   229       43.9     16.5    -154    0.25
Burkina Faso   Nord           2003   231       41       24.6    -166    0.22
Niger          Maradi         2006   231       66.4     14.3    -158    0.29
Rwanda         East           2005   233       47.5     4.6     -109    0.20
Mali           Sikasso        2006   237       45.2     15.8    -154    0.25
Chad           Zone5          2004   240       45.6     15.9    -122    0.20
Mozambique     Cabo Delgado   2003   241       63.2     4.8     -152    0.28
Zambia         Luapula        2001   248       63.9     5.3     -118    0.22
Chad           Zone7          2004   256       40.5     11.3    -122    0.20
Nigeria        North East     2003   260       48       10.8    -166    0.36
Mali           Segou          2006   262       40       14.6    -154    0.25
Nigeria        North West     2003   269       59.5     14.6    -166    0.36
Niger          Zinder         2006   269       65.1     15.9    -158    0.29
Burkina Faso   Sahel          2003   285       53.9     21.2    -166    0.22

Source: Authors' calculations based on latest DHS surveys of respective countries
Note: Mortal. is the under-5 mortality rate per 1,000 live births; Stunt. and
Wast. are prevalence rates in percent. CV relates to the coefficient of
variation and Range to the difference between the minimum and maximum value
within a country. Both measures refer to under-five mortality rates.

Interestingly, the Lake Victoria provinces of Uganda and Tanzania do not
exhibit such an unusual pattern as Nyanza. While anthropometric outcomes for
children are slightly worse in these provinces than in Nyanza, under-5
mortality rates are substantially lower. Since geographical and epidemiological
conditions




Figure 3.1: Stunting rates in SSA

Figure 3.2: Wasting rates in SSA

seem to be similar among all provinces bordering Lake Victoria, one possible
explanation for this finding points to the role of the government and of racial
cleavages affecting migration decisions and the provision of health care.
Compared to Uganda and Tanzania, internal migration at the provincial level in
Kenya is limited by the prevalence of strong ethnic reservations. Furthermore,
as discussed in section 3.4.3, discriminatory practices against the local
ethnic group seem to have led to an underprovision of health care services in
Nyanza compared to most other Kenyan provinces. In contrast, in Uganda the
capital Kampala is situated in close proximity to Lake Victoria, and we would
therefore not expect an underprovision of health care services on the Ugandan
side of the lake. In Tanzania, ethnicity-based inequalities in the provision of
health care services are rather low owing to the nation-building policies
pursued there, and hence health care provision seems to be much more
need-oriented (Miguel, 2004).

The simultaneous appearance of very low levels of malnutrition and tremendously
high rates of mortality in Nyanza, and in particular in the Lake Victoria
region, is unique in the SSA context. That this phenomenon appears in a
national context of relatively low mortality rates makes it even more puzzling
and led us to call it 'The Paradox'. Explaining this paradox is the objective
of this paper and demands a detailed review of the particularities of the
Kenyan context.


3.3 Theoretical Framework 


In order to analyze the described paradox we estimate reduced forms of child 
health and mortality production functions. The choice of our theoretical model 
relies on earlier work in this field done by Akin et al. (1992); Rosenzweig and 
Wolpin (1988) and in particular Behrman and Deolalikar (1988). An overview 
on the general relationships of underlying (exogenous) and proximate factors af- 
fecting health and mortality outcomes is presented in Figure 3.3 which is guided 
by the frameworks as outlined in Mosley and Chen (1984) and UNICEF (2008). 
The conceptual core of the framework is the idea that all background variables 
(cultural, socioeconomic or geographic) have to operate through a limited set of 
proximate determinants (environmental contaminations, maternal factors, infant 
feeding habits, and preventive health care practices) which in turn directly influ- 
ence the risk of disease and the outcome of the disease process. 

Figure 3.3: Theoretical Framework

The relationship among health inputs and child health outcomes can be written
as follows:

$$H_{ijk} = H(E^{o}, P^{o}, v_{k}, u_{jk}, \varepsilon_{ijk}) \tag{3.1}$$

where the health of individual $i$ in household $j$ and community $k$
($H_{ijk}$) is produced by observed underlying factors ($E^{o}$), observed
proximate factors ($P^{o}$) and certain unobserved underlying and proximate
factors ($v_{k}$, $u_{jk}$, $\varepsilon_{ijk}$).

Further on, equation 3.2 depicts the mortality production function for
individual $i$, with mortality ($M_{ijk}$) resulting if health falls below some
critical level $H^{*}$.

$$M_{ijk} = M(H_{ijk} - H^{*}) \tag{3.2}$$
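
One way to read equation 3.2, offered here only as an illustrative sketch, is
as a threshold rule in which death occurs once health drops below the critical
level. The binary form below is an assumption made for exposition; in the
empirical part mortality is modeled as a risk over time rather than as a
deterministic indicator.

```latex
M_{ijk} =
\begin{cases}
1 & \text{if } H_{ijk} - H^{*} < 0, \\
0 & \text{otherwise.}
\end{cases}
```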


In the empirical analysis, the estimation of parameters for the health and mor- 
tality production functions can suffer from three types of problems. First of all, 
since all health related input variables are treated as exogenous, a bias might arise 
if we fail to control for simultaneously determined health inputs in the estimation 
of the health production function. Secondly, health and mortality outcomes are 
influenced by several individual, family, and community variables. Some of these 
variables can be observed; others cannot. Table 3.2 shows how the observed and 
unobserved factors can be classified in our particular case. The simple association 
between, for example, the stunting score of a child and the mother’s educational 
level holds if the observed indicator (e.g. educational status) is not correlated with 


2Empirically we will model M as the risk of mortality at time t.

Table 3.2: Classification of Variables Influencing Health and Mortality Outcomes

Variables    Observed by analysts                        Unobserved by analysts
Individual   Health indicators (anthropometric and       Genetic endowment,
             mortality outcomes, reported diarrhea),     HIV status,
             Health related practices (birth interval,   Nutritional intake
             clinical birth, caesarian section,
             duration of breastfeeding, retro-viral
             drugs, death of previous child,
             vaccinations),
             Age, gender, twin status

Family       Mother's education in years,                Genetic factors, Innate ability
             Mother's age at birth, marital status,      for child care, Parental time
             Mother's BMI and HIV status,                devoted to child care, General
             Household size, asset possession,           knowledge and mental capability,
             Ethnic belonging,                           Intra-household resource allocation,
             Water and sanitation access                 Income and expenditure levels

Community    Geographic location,                        General disease and health
             Rural or urban                              environment, Public infrastructure
                                                         (availability and quality of
                                                         education and health care
                                                         facilities, roads,...),
                                                         Labor market conditions,
                                                         Water quality, Cultural habits

District     Malaria suitability                         Political factors

Province     Health access (People per Physician,        Political factors
             Health expenditures per capita)

unobserved variables (such as labor market conditions) that affect the stunting
score. However, if the unobserved factor affects the child's stunting status
and is correlated with the educational attainment of the mother, the estimated
effect is biased and the standard errors of our parameter estimates are
incorrect. Thirdly, our coefficients can be biased and our standard errors
incorrect even if the unobserved factor affects only the outcome variable and
is completely uncorrelated with the observed explanatory variables. This can
happen when outcomes are clustered: the mortality risks of siblings, and of
children residing in the same community, are correlated partly because these
children share the same family and community characteristics. However, the
correlation may persist after controlling for observed factors, in which case
the remaining correlation is a consequence of genetic, behavioral, and
environmental factors that are common to all children in a particular community
or family but that are unobserved. As a consequence, the still correlated
observations violate the standard assumption of independence in statistical
analyses, resulting in standard errors that are understated and, in the case of
non-linear models such as hazard models, parameter estimates that are both
biased and inconsistent (Trussel and Rodriguez, 1990).

To mitigate these problems we adopt three strategies. First, to circumvent the
first problem we estimate reduced form regressions and therefore exclude any
explanatory variable that we would expect to be simultaneously determined,
which in our case means dropping the length of breastfeeding from our
regression equations. Secondly, we construct and include a malaria variable, a
health care variable and a Lake Victoria variable, the latter indicating
whether a child lives within 20km of Lake Victoria, in order to better capture
the specific nutrition and disease environment, as explained in more detail in
section 3.4. Thirdly, as explained in more detail in section 3.5, we use mixed
model representations to further address unobserved heterogeneity at the family
and community level.
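
To give a sense of how strongly clustering can understate standard errors, the
sketch below applies the textbook design-effect approximation 1 + (m - 1) * rho
for equal-sized clusters. It is only an illustration and not the approach taken
in this paper, which relies on mixed models; the cluster size, intra-cluster
correlation and naive standard error are hypothetical values.

```python
def design_effect(cluster_size, icc):
    """Variance inflation from clustering, assuming equal-sized clusters."""
    return 1.0 + (cluster_size - 1) * icc


def corrected_se(naive_se, cluster_size, icc):
    """Scale a standard error that wrongly assumed independent observations."""
    return naive_se * design_effect(cluster_size, icc) ** 0.5


# Hypothetical inputs: 5 children per community, intra-cluster correlation 0.1
naive = 0.10
adjusted = corrected_se(naive, cluster_size=5, icc=0.1)
print(f"naive SE = {naive:.3f}, adjusted SE = {adjusted:.3f}")
```

With no intra-cluster correlation the factor equals one and the naive standard
error is unchanged; any positive correlation makes the adjusted value larger.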


3.4 Geography vs. Ethnicity: The Kenyan Context 


The seeming disconnection between anthropometric indicators on the one hand and
health and mortality patterns on the other, together with the huge spatial
differentials in mortality and undernutrition indicators prevailing in Kenya,
requires a profound investigation of the underlying causes of this phenomenon.
To facilitate the understanding of the country-specific context it is further
useful to distinguish between underlying causes that affect either child
malnutrition or child mortality and those that affect both simultaneously.
Rather than reviewing all relevant causes, in this section we pay attention
only to those that are particular to the Kenyan context, with a main focus on
nutritional, epidemiological and cultural factors.


3.4.1 Nutritional Environment 


Food and nutrition availability affects anthropometric and mortality outcomes
alike. While insufficient food and vitamin intake mostly influences child
mortality indirectly, by increasing the predisposition to disease, it often
expresses itself directly in anthropometric measures, which therefore
frequently serve as proxies for the health and mortality environment when
reliable data on the latter are absent. The two most widely used anthropometric
indices for children are stunting (low height for age) and wasting (low weight
for height), which serve different purposes. Stunting is regarded as an
indicator of chronic undernutrition resulting from prolonged food deprivation
or illness, while wasting is taken to reflect acute undernutrition resulting
from more recent food deprivation or illness (Nandy et al., 2005).
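
The two indicators can be made concrete with a short sketch. It assumes the
commonly used convention that a child counts as stunted (wasted) when the
height-for-age (weight-for-height) z-score lies below -2; this cutoff is an
assumption here, since the text above does not state it, and the z-scores are
hypothetical.

```python
def classify_child(haz, whz, cutoff=-2.0):
    """Flag stunting and wasting from height-for-age (haz) and
    weight-for-height (whz) z-scores; the -2 cutoff is an assumed convention."""
    return {"stunted": haz < cutoff, "wasted": whz < cutoff}


def prevalence(flags):
    """Share of True values in a list of booleans, in percent."""
    return 100.0 * sum(flags) / len(flags)


# Hypothetical (haz, whz) z-scores for five children
children = [(-2.6, -0.4), (-1.1, -2.3), (-0.3, 0.2), (-3.0, -2.1), (-1.9, -1.0)]

status = [classify_child(haz, whz) for haz, whz in children]
stunting_rate = prevalence([s["stunted"] for s in status])
wasting_rate = prevalence([s["wasted"] for s in status])
print(f"stunting: {stunting_rate:.1f}%, wasting: {wasting_rate:.1f}%")
```

A child can be stunted without being wasted and vice versa, which is why the
two prevalence rates can move in different directions within one region.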

In addition, there are further factors that manifest themselves very
differently in the two anthropometric indicators. In particular, the type of
food consumed in the first months of life plays an important role in the growth
process of children and hence affects the stunting indicator, while the wasting
indicator remains relatively unchanged. Certain micro- and macronutrients, such
as iron, calcium, iodine, vitamin A and proteins, are essential for the
transformation of energy into body growth (Moradi, 2009). Accordingly, the body
fails to grow at a normal rate when intakes of certain micro- and
macronutrients are chronically insufficient, and exceeds normal growth rates
when certain micro- and macronutrients are consumed in large amounts. An
important food that makes a vital contribution to the survival, health and body
growth of children is fish, which provides quality proteins and fats
(macronutrients) as well as vitamins and minerals (micronutrients). It is
notable, however, that fish consumption per se does not drive the higher than
average growth. Biomedical research shows that the growth effect of fish
consumption seems to occur only in combination with a well-balanced diet
(Marques et al., 2008).

Neglecting the role of diseases in stunting and wasting, we would expect both
indicators to show high z-values in areas where food availability is high.
Moreover, the incidence of stunting should be particularly low in those regions
where fish is widely available together with other crops and foods.

The Lake Victoria region in fact offers the advantageous conditions just
mentioned. Soils are mostly of good quality, resulting in agricultural
production surpluses, which led Fearn (1961) to call the region the 'granary of
East Africa'. Moreover, fish is widely available, at least in close proximity
to the lake, despite the strong export orientation of the fishing industry in
the context of the Nile perch boom. In addition, the region is situated on the
trade route between Tanzania and the comparatively prosperous Central Kenya,
which might further help to increase the variety and availability of food in
the region.

In general, reliable administrative data on soil quality, frequency of rain and
agricultural production are not available for Kenya. In the absence of these
data it is difficult to assess the food availability and security situation for
other regions of the country. Fortunately, USAID created the Famine Early
Warning System (FEWS), which issues early warning and vulnerability information
on emerging and evolving food security issues in the world. To inform
researchers and policy makers, FEWS generates the so-called Water Requirement
Satisfaction Index (WRSI) for maize and the Normalized Difference Vegetation
Index (NDVI). Both indices are updated regularly and allow for the
investigation of intra-country differences (fews.net). The WRSI for Kenya is
used as an indicator of maize performance based on the availability of water to
the crop during the growing season. Maize has been selected because it is the
most important cereal crop in Sub-Saharan Africa and because it is cheaper,
less water intensive and climatically more robust than other cereals, which
makes the WRSI an ideal indicator of food security. Looking at the map of the
WRSI for Kenya (Figure 3.4), two things are remarkable. First, the areas close
to Lake Victoria obtain the highest scores in the whole country, indicating a
very food secure situation. Secondly, the level of food security deteriorates
steadily the further a location is from Lake Victoria, an exception being the
coastal area around Mombasa, where the food security situation improves again
(FEWS-Net, 2004). This result is reinforced by the map of the NDVI for Kenya
(Figure 3.5), which is based on meteorological NASA satellites that use an
advanced very high resolution radiometer to indicate the vigor and density of
vegetation at the earth's surface. An inspection of the map shows a similar
pattern, with the areas around Lake Victoria showing the highest vegetation
density and areas further away showing continuously decreasing vegetation
levels (FEWS-Net, 2004). Relying


Figure 3.4: Water Requirement Satisfaction Index (WRSI) 




Source: FEWS-Net (2004) 


on the two proxies for food security described above, two main implications can
be derived. Owing to the general increase in food security and food
availability the closer one gets to Lake Victoria, wasting and stunting
indicators should improve and reach their lowest values in the area around Lake
Victoria. Furthermore, the incidence of stunting should be extraordinarily low
on the shores of Lake Victoria, since fish is widely available as a staple food
in this area.

Figure 3.5: Normalized Difference Vegetation Index - Kenya


Source: FEWS-Net (2004) 


3.4.2 Epidemiological Factors 


Child mortality levels in Kenya declined rapidly after independence in the
early 1960s and reached their minimum in the late 1980s. From then on the trend
reversed and, despite a significant drop in overall fertility rates, mortality
levels rose continuously, reaching 115 per 1,000 in the 2003 DHS round (Hill et
al., 2004; CBS, 2004). This adverse trend was accompanied by stagnant growth of
per capita income, declining levels of immunization, falling school enrollment
and, foremost, the emergence of the AIDS epidemic (Hill et al., 2004). One of
the salient findings of the 1998 and 2003 DHS rounds is the enormous variation
of child mortality rates among the provinces, with rates of 54 per 1,000 in the
Central Province and 204 per 1,000 in Nyanza in the 2003 round.

In the following we want to shed some light on the principal underlying causes 
of the geographic differentials in the observed child mortality rates in order to 
underscore the role of Lake Victoria on mortality patterns in Kenya. 

When looking at the data for Nyanza it is surprising that the established link
between nutritional status, in particular wasting or acute malnutrition, and
child mortality risk is not reflected in the health statistics. While children
in the Nyanza region show above average scores in the anthropometric
indicators, infant and child mortality rates are highest in the region. A first
major reason that helps to explain this paradox is the unequal distribution of
malaria prevalence within the country, depicted in Figure 3.6, which is due to
regional variations in temperature, humidity, and the presence of bodies of
water. Although malaria is epidemic in several areas of Kenya, the Lake
Victoria region is the only endemic region in the country, with a transmission
period that lasts the whole year (MARA, 2004). Moreover, it is important to
note that the risk of malaria infection does not decline continuously with
distance from Lake Victoria. Because the East African Rift Valley rises only
some kilometers away from the lake, a natural malaria barrier exists in the
east that drastically reduces the risk of malaria infection in these regions.
In all other regions, except a small strip along the coast, climatic conditions
do not favor the reproduction of female anopheles mosquitoes over the whole
year because of long periods without rain. Hence, malaria transmission in these
areas is largely restricted to the rainy seasons.


Figure 3.6: Climate Suitability for Endemic Malaria 




Source: MARA (2004) 


A second major aspect that affects health and mortality outcomes of young 
children is the quality of drinking water. Despite being the second largest fresh 
water lake in the world, the water of Lake Victoria is not safe for drinking and sev- 
eral cases of outbreaks of waterborne diseases are reported each year (Ochumba 
and Kibaara, 1989; Oguttu et al., 2008; Omwega et al., 2003; Scheren et al., 2000). 

The third major difference between the Lake Victoria region and other regions
in Kenya is the high prevalence of HIV/AIDS in the area. While cultural
factors, as described later on, can explain part of the high HIV rates in the
area, recent studies point to the social erosion of family norms among people
living on the shores of Lake Victoria. The Nile perch boom in the area,
starting in the mid 1990s, and the resulting demand for male labor in the
fishing industry led to a strong influx of migrants into the region, which was
accompanied by a growing prostitution business (Geheb et al., 2008). Moreover,
the increase in demand for male labor shifted intra-household bargaining power
towards men and further weakened the already inferior position of women,
thereby increasing the likelihood of involuntary risky sexual behavior for
women (Béné and Merten, 2008). In addition, Nyanza province is situated on
thriving trade and migration routes connecting the economically powerful
central area of Kenya with Tanzania. Together with the high urbanization rates
in Nyanza, this is likely to contribute to the higher HIV/AIDS rates in the
area (Oster, 2008).

Taking into account the spatial distribution of mortality drivers outlined
above, we would expect mortality rates to increase strongly in close proximity
to Lake Victoria, the main reasons being, among others, the comparatively high
HIV/AIDS prevalence and the much stronger predisposition to infectious diseases
such as malaria in that area.


3.4.3 Cultural Factors 


Ethnic belonging affects mortality and undernutrition levels in Kenya through a
variety of mechanisms. Notably, geographic and cultural factors, in addition to
the prevailing political economy, play an important role with respect to the
health outcomes of certain ethnic groups in the country.

Since most ethnic groups in Kenya live spatially concentrated in a very par- 
ticular region of the country, current administrative provincial boundaries were 
usually drawn based on the location of a certain ethnicity. The Lake Victoria re- 
gion is part of the Nyanza province which is predominantly inhabited by the Luo 
ethnic group. Although Luo have several cultural practices in common with most 
other ethnic groups in Kenya, there exist three noteworthy differences. 

Firstly, while most ethnic groups in Kenya practice male circumcision, the Luo,
along with the Turkana and Iteso, who represent only a small part of the
population, are known for not practicing it, which substantially increases
their risk of HIV infection and HIV related mortality (Chesoni, 2006).
Secondly, the type of nutritional intake differs from that of other ethnic
groups. Having lived in close proximity to Lake Victoria for more than 400
years (East African Living Encyclopedia), the Luo have benefited from the
favorable food availability and protein situation, and historical data
accordingly show significantly higher mean heights for Luo men than for men of
all other Kenyan ethnic groups (Moradi, 2009). Given the long period of
settlement close to Lake Victoria and the historical data, one might speculate
that genetic differences have emerged between the Luo and other ethnic groups
which lead to mean height advantages that manifest themselves in nutritional
indicators already at early ages. However, according to the recent WHO study on
the new child growth standard, genetic factors seem to play a minor role in
explaining disparities in physical growth among children (WHO, 2006).3 Thirdly,
fish consumption in Kenya is determined not only by availability but also by
cultural habits. While the Luo and some ethnic groups in the coastal area use
fish as a staple food, it is viewed with considerable suspicion among ethnic
groups in Central and Eastern Kenya (Oniang'o and Komokoti, 1999; Peters and
Niemeijer, 1987).


With respect to differences in the position of women among Kenyan ethnic
groups, no clear picture emerges. While polygamy is common among most of
Kenya's ethnic groups, widow inheritance is practiced primarily by the Luo and
certain smaller clans among the Luhya, weakening the role of women in these
communities. In contrast, female genital mutilation is practiced by the
majority of Kenya's ethnic communities but not by the Luo, Turkana, Luhya and
Iteso (Chesoni, 2006). Moreover, female education levels tend to be high among
Luo women compared to other ethnic groups (Wainaina, 2006).

Furthermore, ethnic belonging plays a crucial role in the allocation of public
resources and political positions in Kenya, owing to the prevailing kinship
structures and patron-client relationships (Cohen, 1995; Miguel, 2004; Miguel
and Gugerty, 2005; Weinreb, 2001), and in this way affects health indicators.
Of the more than 40 ethnic groups in Kenya, the Luo represent about 13% of the
population and constitute the third largest ethnic group; only the Kikuyu with
23% and the Luhya with about 14% have higher shares of the overall population.


Although the Luo played an important role in Kenya's independence process, they
have been politically under-represented at the national level and, except very
recently, have not been part of any coalition since 1965. The Luo are the only
major ethnic group in Kenya that has not been part of the national government
over this time span, and this under-representation of Luo interests at the
national level has resulted in limited access to public funds and has led to a
steady under-investment in health and schooling facilities in the Nyanza region
(Alwy and Schech, 2004; Muhula, 2008; Nyanjom, 2006) compared to most other
regions, the exception being the north east of Kenya.

3To further investigate this issue, we compared children's mean stunting scores
between Luo and other ethnicities outside the Lake Victoria region and,
additionally, outside Nyanza and Western province. Based on a one-way ANOVA,
differences in height-for-age scores turned out to be statistically
insignificant in this setting. Therefore, we conclude that genetic differences
do not explain the observed growth differential for children in our context.

Bearing in mind the circumcision practices of ethnic groups in Kenya, we would
expect the highest HIV/AIDS prevalence among the Luo. In addition, the
political economy of Kenya is likely to further aggravate mortality levels
among the Luo through worse access to health care facilities compared to the
other main ethnic groups in Kenya (Cutler et al., 2006). Since discriminatory
practices in the allocation of public resources probably operate at the
provincial level, meaning that relatively less money goes to Nyanza as a whole,
there might be an unfavorable effect on mortality levels for all ethnic groups
living in Nyanza.


3.5 Empirical Findings 


3.5.1 Data 
The KDHS 2003 


In the empirical analysis we use data from the 2003 round of the Kenyan
Demographic and Health Survey (KDHS). The KDHS 2003 includes full birth history
information from 4346 women of reproductive age who gave birth to at least one
child in the five years preceding the survey. For the first time the KDHS
includes data from all provinces of Kenya as well as data on HIV testing. The
survey is based on a two-stage design. In the first stage 400 clusters were
randomly chosen from a master frame. In the second stage households were
systematically sampled from each cluster.4 In every second household sampled,
men aged 15 to 54 years were interviewed using a Men's questionnaire. All women
and men living in households selected for the Men's questionnaire were asked to
participate voluntarily in the HIV testing. 76% of all eligible women agreed to
undergo the test.5

4In the following we often refer to clusters as communities since in the DHS
context a cluster is a geographical unit consisting of several households.

5The official KDHS 2003 report provides several descriptive and multivariate
examinations of whether non-participation in HIV testing is systematically
related to other variables. No systematic relationship was found (CBS, 2004)
and we therefore expect our results not to be affected by sample selection bias
when using the reduced HIV sample.

In addition to the variables directly derived from the household
questionnaires, we calculate the distance of each cluster to the shores of Lake
Victoria using the GPS coordinates provided by ORC Macro. We define the Lake
Victoria region as the area within 20km of the shores of the lake. Furthermore,
we exploit the MARA (2004) database on endemic malaria to obtain district level
information on malaria prevalence in Kenya. Unfortunately, we could not obtain
data on the health care sector in Kenya at the district level. Instead, we rely
on data published in Nyanjom (2006), who reports the number of people per
medical officer and public health expenditures per capita at the provincial
level for the period 1995 - 1998.
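
The 20km rule for the Lake Victoria region can be sketched as follows. This is
an illustrative reconstruction rather than the procedure actually used: it
assumes that the shoreline is approximated by a list of points and that
great-circle (haversine) distance is adequate. All coordinates and names below
are hypothetical.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlmb = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def in_lake_region(cluster, shoreline, threshold_km=20.0):
    """True if the cluster lies within threshold_km of any shoreline point."""
    lat, lon = cluster
    nearest = min(haversine_km(lat, lon, s_lat, s_lon)
                  for s_lat, s_lon in shoreline)
    return nearest <= threshold_km


# Hypothetical shoreline points and cluster locations (latitude, longitude)
shoreline_points = [(-0.10, 34.75), (-0.40, 34.20), (-0.90, 34.10)]
clusters = {"cluster_a": (-0.12, 34.85), "cluster_b": (-0.30, 36.00)}

lake_dummy = {name: in_lake_region(loc, shoreline_points)
              for name, loc in clusters.items()}
print(lake_dummy)
```

The resulting boolean per cluster plays the role of the Lake Victoria variable;
a denser set of shoreline points would make the distance more accurate.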


The samples 


As is common in the literature we include only those children who were born
within the 5 years preceding the survey. Since hygienic and socio-economic
conditions are less likely to have changed over the course of 5 years than over
10 or more years, this decision improves the accuracy with which the covariates
are matched to the actual survival time in the multivariate analysis of under-5
mortality. Moreover, data on children's height and weight were collected only
for children below the age of 5 at the time of the survey, which restricts the
information on the nutritional status of children to the same period.

The mortality sample consists of 1368 mothers reporting 2697 births in the
last five years. 605 of these children died. The respective undernutrition
sample comprises 1218 (1217) mothers who reported data on 1704 (1701) children
alive at the time of the survey for the stunting (wasting) regressions.7


Variables of Interest 


The selection of variables for descriptive statistics and for the undernutrition and
mortality regressions is guided by the frameworks outlined in the previous section and
by the discussion of the role of ethnic, political and geographical factors in section
3.4 of this paper. An overview of the variables used in this article, including their
coding, is provided in Table 3.3.


¹⁰ The same database from http://www.mara.org.za/ was used in Oster (2007) to calculate regional
malaria prevalence rates. Moreover, at the country level these malaria measures are closely
correlated with climate-determined malaria susceptibility, as used in Sachs and Malaney (2002).

¹¹ Data on children’s height and weight were missing for 11% of all living children.


Two variables deserve particular attention. First, as mentioned earlier,
we take the HIV status of the mother into account. The interpretation of this
variable in the regression context is not straightforward. Since the HIV status of
children was not collected, it remains unclear whether the virus was
transmitted to the child at all during pregnancy or breastfeeding, and
whether the mother already carried the virus at the time of the child's birth.
Incorporating the HIV status of the mother in the regressions is therefore likely to
yield a downward-biased coefficient with a lower significance level. Moreover,
the HIV status of the mother measures not only a direct epidemiological effect
on children but also a socioeconomic one. In particular, children in an HIV-affected
household might suffer from the diminished capacity of their main caregivers
to purchase key inputs for the children, because the disease reduces household
income. Furthermore, as described in section 3.4, the HIV status of a parent is
partly related to cultural practices, e.g. male circumcision, and therefore captures
cultural elements as well.

Second, the distance of a cluster to Lake Victoria plays an important role in our
study. As pointed out in section 3.4, we would expect much better stunting values in
close proximity to Lake Victoria owing to the availability of fish and other food
throughout the year, while for wasting we would expect rates to improve steadily
the closer a cluster lies to Lake Victoria. In contrast, we would expect under-5 mortality
levels to deteriorate substantially in close proximity to Lake Victoria because of the
much more adverse disease environment in this area. Since most of our health environment
and geographical variables are measured only at the provincial or district level
and, moreover, might not be free of measurement error, we would still expect our
distance variables to show an effect. To measure these effects appropriately,
we include a dummy variable indicating whether a household lives in the
Lake Victoria region, i.e. within 20km of Lake Victoria, in the stunting
and under-5 mortality regressions, while in the wasting regression the distance to
Lake Victoria enters as a continuous variable.
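
The two ways of coding proximity to Lake Victoria described above can be sketched as follows. This is only an illustration, not the code used for the study: the cluster names and distances are hypothetical, and counting a distance of exactly 20 km as inside the region is an assumption.

```python
# Hypothetical cluster distances to Lake Victoria in km (illustrative values only).
cluster_distance_km = {
    "cluster_a": 4.5,
    "cluster_b": 19.9,
    "cluster_c": 20.0,
    "cluster_d": 135.0,
}

LAKE_REGION_THRESHOLD_KM = 20.0  # region definition taken from the text


def lake_region_dummy(distance_km, threshold_km=LAKE_REGION_THRESHOLD_KM):
    """Return 1 if the cluster lies within threshold_km of the lake, else 0.

    Counting a distance of exactly threshold_km as inside is an assumption.
    """
    return 1 if distance_km <= threshold_km else 0


# Dummy regressor, as used in the stunting and under-5 mortality regressions.
lake_dummy = {name: lake_region_dummy(d) for name, d in cluster_distance_km.items()}

# Continuous regressor, as used in the wasting regression: the distance itself.
lake_distance = dict(cluster_distance_km)

print(lake_dummy)
print(lake_distance)
```

The dummy discards all variation in distance beyond the threshold, whereas the continuous variable lets the estimated effect change with every additional kilometre.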


From the economic literature (Mosley and Chen, 1984; Smith and Haddad,
2002) as well as from our theoretical framework, it becomes clear that the variables
described in Table 3.3 are important for studying the determinants of undernutrition
as well as the context of under-5 mortality. Thus, we use the same list
of covariates in the multivariate analysis of undernutrition and mortality.¹² The
KDHS includes some further variables, e.g. information on children’s protein intake
or pre- and postnatal care, which are used only for descriptive purposes, since
these variables exhibit a very large number of missing observations and would
reduce the sample size for the multivariate analysis too strongly.

¹² The final model specifications include squared terms whenever the respective coefficient
was statistically significant. Otherwise squared terms were excluded. In this regard,
model specifications might differ between the stunting, wasting and mortality regressions.


3.5.2 Descriptive Statistics 


Summary statistics on the variables used in this study are provided in Table 3.4.
In Table 3.4 we distinguish between different geographical and ethnic
specifications. Column 1 reports statistics for the Lake Victoria region, the
area within 20km of the shores of the lake, while column 2 provides
information for all of Kenya except the Lake Victoria region. Columns 3 and
4 refer exclusively to the Lake Victoria region. Column 3 provides summary statistics
for the Luo ethnicity, while statistics on the remaining ethnic groups in
the area are shown in column 4. In addition, columns 5-12 contain information
on the same set of variables for Kenya as a whole and for each of the eight Kenyan
provinces.

The first 3 lines of Table 3.4 already demonstrate the distinct setting of the
Lake Victoria region in Kenya in terms of child malnutrition and mortality outcomes.
As discussed in section 3.4, rates of stunting and wasting in this region are
far below the national average. In contrast, under-5 mortality is by far
the highest in Nyanza, reaching its peak in the Lake Victoria region. Bearing in
mind that the underlying causes of malnutrition and mortality may differ, we look
separately at geographical and ethnic disparities in the variables of interest.


Malnutrition


The results from Table 3.4 show that dietary intake is much higher in the Lake
Victoria region than in the rest of the country, suggesting higher food availability
at the shores of Lake Victoria. The higher than average intake of
local grains and vitamin A rich fruits, like mango or papaya, can be attributed to
the fertile soil of the lake basin in combination with sufficient rainfall, which
facilitates a large supply of these foods. Moreover, 38% of Luo mothers in the
Lake Victoria region give protein rich food to their children at least once a
week, compared to only 26% of mothers belonging to other ethnic groups in the
region and to a national average of 22%. This remarkable difference between Luo
and other ethnic groups within the Lake Victoria region might partly be explained
by different food preferences, as outlined in section 3.4.3, whereby other ethnic
groups do not use fish as a staple food. Figures 3.4 and 3.5 suggest more favorable
agricultural conditions the closer to the lake (FEWS-Net, 2004), substantiating
the finding of highly cultivable soil near Lake Victoria. In the DHS, data on protein
intake are not further disaggregated into the shares of fish, meat or eggs. We
therefore use secondary data to stress the relevance and availability of fish. Data from the
Kenya Integrated Household Budget Survey (KIHBS) 2005/2006 show that fish
consumption is highest in Nyanza province. In addition, households in Nyanza
spend 6.1% of their food budget on fish, compared to a national average
of 2.1% (Kenya National Bureau of Statistics, 2006).


Mortality 


The high mortality rates in the Lake Victoria region point to the existence of ex- 
traordinary factors that help to explain the observed outcomes. 

Table 3.4 shows considerably higher malaria, HIV and diarrhea prevalence in
the Lake Victoria region than in all other parts of Kenya. This result confirms the
findings from the literature review in section 3.4. Moreover, high malaria rates are
not confined to the Lake Victoria region and Nyanza but extend, to a lesser extent,
to Western province, which might be attributable to part of its area bordering
Lake Victoria. Furthermore, the incidence of diarrhea in the Lake Victoria region
(21.7%) is clearly above the Kenyan average and about 4% above
the Nyanza average, which might indicate the poor quality of drinking water from
Lake Victoria and its connected open waters. Finally, HIV rates in Nyanza,
and in particular in the Lake Victoria region, are much higher than in any other
area of the country, which seems to be partly due to the region's comparatively high level of
urbanization and its extraordinary position as a traffic hub between Tanzania and
Central Kenya.

Geography plays an important role in under-5 mortality. Ethnicity may be
as important. The Lake Victoria region is, besides its unfavorable disease
environment, characterized by being predominantly populated by the Luo ethnic
group. The Luo represent 81% of the total Lake Victoria population and 94% of
the Lake Victoria population in the province of Nyanza. The small upper northern
part of the Kenyan Lake Victoria region belongs to Western province and is
mainly populated by the Luhya ethnic group who also represent almost the entire
remaining population of Nyanza (17%). Distinguishing between ethnic groups
within the Lake Victoria region enables us to disentangle the effects of geographical
and cultural factors. The Luo exhibit significantly worse outcomes in several
proximate factors of child mortality than the other ethnic groups around Lake Victoria.
26% of all Luo mothers tested HIV positive. Compared to a national
average of 8.8% and to an average of 10.6% for the remaining ethnic groups in
the Lake Victoria region, this result clearly points to a strong relationship between
the cultural practices of the Luo and HIV infection, as already described in section 3.4.
Moreover, the size of the differences between ethnic groups in the Lake Victoria
region is striking. While circumcision practices have been pointed out as one of
the potential reasons for higher HIV rates among the Luo, the difference between
circumcised and uncircumcised adults in most DHS reports for SSA countries
is substantially lower and usually amounts to 4%-7%.

A similar picture emerges from maternal factors and the pre- and postnatal
behavior of the Luo. Luo mothers seem to start bearing children much earlier
than Luhya and most other ethnic groups. In addition, birth intervals between
consecutive children are considerably shorter, reflecting higher total fertility rates
among the Luo than among any other ethnicity (CBS, 2004). Moreover, short
preceding birth intervals are often caused by the death of the previous child, and short
succeeding birth intervals result in the termination of breastfeeding. Indeed, both
indicators, previous dead child and breastfeeding, are found to be particularly
unfavorable for the Luo. Furthermore, the average number of child deliveries and
caesarean sections as well as pre-birth visits in official health centers all show
the lowest value for the Luo. Besides the cultural habits of the Luo, these outcomes
may also point to discriminatory practices against the Luo at the national level,
resulting in limited access to public funds and hence in lower public health
expenditures. Such an interpretation is supported by the data on the Kenyan health care
sector obtained from Nyanjom (2006). These data indicate that Nyanza receives
the lowest amount of public health expenditures per capita and also has the
highest ratio of inhabitants per physician among all Kenyan provinces.

The descriptive findings are in line with the considerations set out in section 3.4.
Thus, geographical, cultural and political factors seem to contribute
jointly to the high mortality rates in the Lake Victoria region compared to the rest
of Kenya. Due to the simultaneous occurrence of all three of these factors in the
Lake Victoria region, it is difficult to isolate the influence of any single factor on
our anthropometric and mortality outcomes when relying on bivariate statistics
and analysis. Therefore, in the following section we use regression analysis
to examine causal relationships running from the observed covariates to the malnutrition
and mortality outcomes, with special emphasis on geographic
and cultural factors.


3.5.3 Method 


In order to investigate the determinants of undernutrition we rely on a linear
regression model, while for under-5 mortality we use a (nonlinear)
Cox proportional hazard model. We use multilevel extensions of the respective
models for several reasons. Multilevel modeling allows for efficient and, for
nonlinear models, consistent estimation in the case of significant intragroup clustering.
Beyond that, we exploit in particular the variance decomposition inherent
in multilevel modeling. As we will see later on, this helps to demonstrate
the relevance of the family and community environment for individual
outcomes and to assess the contribution of observed covariates to the within-level
variation.

The survey design of the KDHS 2003 involves hierarchical collection of data
at the family and community level, which results in clustering of undernutrition
and mortality outcomes. In regression analysis, clustering is problematic if it is
due not only to observed but also to unobserved household and community factors.
For instance, net of observed factors, people within the same community
are more alike than people from different communities since they are likely to share
similar latent characteristics. The statistical assumption of independent
error terms is therefore violated; as a consequence, confidence intervals are
underestimated, leading to false statistical inference. Typical unobserved factors
at the family level are shared genetic factors, social practices or the pre- and
postnatal behavior of the mother (Bolstad and Manda, 2001). Likewise, shared
environmental factors may lead to clustering at the community level.¹³ To illustrate
the extent of clustering in the KDHS 2003 data, we focus on the observed under-5
mortality outcomes. In our sample for all of Kenya, 2915 (67%) of the 4346
interviewed women did not experience any child deaths, while 941 (21.7%) women
suffered the death of exactly one child. Just 184 (4.2%) women experienced
three or more child deaths in the five years preceding the survey. These
4.2% account for more than 30% of all deaths, showing a substantial degree of
correlated outcomes and clustering within families. A similar pattern arises at the
community level: 62% of the 400 clusters under consideration account for 26%
of all child deaths, while 21% of the communities account for
more than 50% of all child deaths.

¹³ See section 3.3 or Sastry (1997a) for a detailed overview of potential unobserved family
and community characteristics.

Because outcomes are strongly clustered at the mother as well as at the
community level, we use multilevel mixed effects variants of the respective
regression models. We use three-level models, controlling for correlated outcomes
among siblings, i.e. at the household level, and within communities.¹⁴ In these
models the error term is decomposed into a separate error term at each level, capturing
unobserved heterogeneity at each level. We will especially exploit the option
of variance decomposition to measure the explanatory power of the covariates for
the between-family and between-community variation and, even more, to shed light on the
specific contribution of each variable to the overall variation of malnutrition and
mortality outcomes across families.¹⁵


The linear multilevel model for malnutrition 


To analyze the driving factors of malnutrition, we use a linear multilevel regression
model. As mentioned above, multilevel models are applied to control for
clustering caused by unobserved heterogeneity and to increase the precision of
the estimated coefficients of the covariates. For an introduction to linear multilevel
models see section 2.3.2.

Since we control for family as well as for community specific effects, our
three-level random intercept model reads:


$$y_{ijk} = \alpha + \beta' x_{ijk} + (v_k + u_{jk} + \varepsilon_{ijk}), \qquad (3.3)$$


with i = 1,...,I individuals, j = 1,...,J households and k = 1,...,K communities.
$u_{jk}$ is the household random effect, $v_k$ the community random effect. Y
is a vector of stunting and wasting outcomes, respectively, and X is a vector of
observed covariates.

The models are estimated using the “xtmixed” command implemented in Stata 
(Stata, 2007). 
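
The variance decomposition behind the error term in equation (3.3) can be illustrated with a short sketch. It is not the authors' estimation code: the variance components below are hypothetical numbers, and the function assumes, as is standard for random intercept models, that the three error components are mutually independent.

```python
def variance_shares(var_community, var_household, var_child):
    """Split the total residual variance of a three-level random intercept model."""
    total = var_community + var_household + var_child
    return {
        "community_share": var_community / total,
        "household_share": var_household / total,
        "child_share": var_child / total,
        # Two children of the same household share the community
        # and the household effect.
        "same_household_corr": (var_community + var_household) / total,
        # Children of different households in the same community
        # share only the community effect.
        "same_community_corr": var_community / total,
    }


# Hypothetical variance components, not estimates from the KDHS data.
shares = variance_shares(var_community=0.10, var_household=0.30, var_child=0.60)
for name, value in shares.items():
    print(f"{name}: {value:.2f}")
```

The larger the household and community shares, the more strongly outcomes of children in the same family or cluster are correlated, and the more a single-level regression would understate its standard errors.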


¹⁴ Using a two-level model by neglecting correlations at the community level would lead to an
overestimation of the family random effect.

¹⁵ It should be noted that multilevel modeling may not control for omitted variable bias in the case
of linear models. Fixed effects models would be required. However, panel data on the respective
household information are not available and panel estimation is therefore infeasible.


The multilevel Cox frailty model for under-5 mortality


Cox proportional hazard models, proposed by Cox (1972), are the standard models
used in child mortality analysis (Cox and Oakes, 1984; Cameron and Trivedi,
2005). In a proportional hazard model, the hazard rate is the instantaneous risk of
death at t conditional on survival up to t:


$$\lambda(t \mid x) = \lambda_0(t) \exp(x'\beta). \qquad (3.4)$$


$\lambda_0$ is the baseline hazard, which depends only on t. By contrast, $\exp(x'\beta)$
depends only on x. The idea is that all hazard functions are proportional to the
baseline hazard, merely shifted by the scale factor $\exp(x'\beta)$. $\lambda(t \mid x)$ is the
hazard of child death at time t given x. The advantage of the semi-parametric Cox
proportional hazard model over parametric models is that no functional form of
the underlying hazard function has to be assumed.
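
The proportionality property of equation (3.4) can be made concrete with a small sketch. The baseline hazard, the coefficients and the two covariate profiles are all hypothetical choices made for illustration. The point is only that the ratio of two hazards does not depend on t, so the baseline never has to be specified to interpret the coefficients.

```python
import math


def hazard(t, x, beta, baseline_hazard):
    """Cox hazard lambda(t|x) = lambda_0(t) * exp(x'beta)."""
    linear_predictor = sum(x_i * b_i for x_i, b_i in zip(x, beta))
    return baseline_hazard(t) * math.exp(linear_predictor)


def baseline_hazard(t):
    """Hypothetical baseline hazard that declines with age t (in months)."""
    return 0.05 / (1.0 + t)


beta = [math.log(1.5), math.log(0.8)]  # illustrative coefficients
child_a = [1, 0]                       # hypothetical covariate profiles
child_b = [0, 1]

# The ratio of the two hazards is the same at every age t.
ratios = [
    hazard(t, child_a, beta, baseline_hazard) / hazard(t, child_b, beta, baseline_hazard)
    for t in (1, 12, 48)
]
print(ratios)


def percent_change_in_hazard(coefficient):
    """Percentage change in the hazard for a one-unit increase in a covariate."""
    return (math.exp(coefficient) - 1.0) * 100.0


print(percent_change_in_hazard(math.log(1.2)))
```

Exponentiated coefficients are hazard ratios: a value above 1 raises the hazard at every age by the same factor, a value below 1 lowers it.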

The multivariate kindred frailty model was developed by Vaupel (1989,
1990). The two-level model was applied to study the effect of unobserved shared
family characteristics on survival. Sastry (1997b), later Bolstad and Manda
(2001) in a full Bayesian approach, and Pankratz et al. (2005) extended the model
to the multilevel case.¹⁶ In a standard frailty model, a frailty, z, is an unobserved
random effect that acts multiplicatively on the hazard function:


$$h(t \mid x, z) = z \lambda(t \mid x). \qquad (3.5)$$


In economic survival analysis, the frailty is usually referred to as a shared
frailty since it is a random effect that is the same for all members of a group,
for example a family or a community effect (Anderson et al., 2007). Transferred
to the three-level model with unobserved frailty at the household and community
level, the hazard function reads:


$$h_{ijk}(t \mid u_{jk}, v_k) = u_{jk}\, v_k\, \lambda(t \mid x_{ijk}), \qquad (3.6)$$


with i = 1,...,I individuals, j = 1,...,J households and k = 1,...,K communities.
$u_{jk}$ is the household random effect, $v_k$ the community random effect. The
individual (child) frailty is absorbed in the baseline hazard. The unobserved frailty
is assumed to be distributed independently of all covariates and to follow a Gamma
distribution with mean 1 and variance $\theta$ (Sastry, 1997b).
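
As a sketch of the frailty assumption, a Gamma distribution with mean 1 and a given variance can be parameterized by shape 1/variance and scale equal to the variance. The simulation below checks this numerically with Python's standard library and then applies the multiplicative structure of equation (3.6). The variance value, the seed and the function names are assumptions made for illustration.

```python
import random


def draw_gamma_frailty(theta, rng):
    """Draw a frailty from a Gamma distribution with mean 1 and variance theta."""
    shape = 1.0 / theta   # mean = shape * scale = 1
    scale = theta         # variance = shape * scale**2 = theta
    return rng.gammavariate(shape, scale)


rng = random.Random(2003)  # fixed seed so the sketch is reproducible
theta = 0.5                # hypothetical frailty variance
draws = [draw_gamma_frailty(theta, rng) for _ in range(100_000)]

mean = sum(draws) / len(draws)
variance = sum((d - mean) ** 2 for d in draws) / len(draws)
print(round(mean, 2), round(variance, 2))


def three_level_hazard(base_hazard, household_frailty, community_frailty):
    """Scale a child's hazard by its household and community frailties."""
    return household_frailty * community_frailty * base_hazard
```

A frailty above 1 raises the hazard of every child in the household or community by the same factor, and a frailty below 1 lowers it. With variance zero all frailties equal 1 and the model collapses to the standard Cox model.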

The family frailty effect measures the variation in family specific exposure to
risk across families after controlling for observed variables. Children of families
with a large frailty have, ceteris paribus, a larger risk of dying. This excess risk
would be triggered by unobserved behavioral or genetic family specific factors.
Insignificant variation of the family frailty effect would suggest that unobserved
family specific characteristics play no role. In this case, survival chances of siblings
would be uncorrelated.

The model is estimated using the “coxme” command of the kinship package 
in R (Therneau, 2006). 


¹⁶ Guo and Zhao (2000) give a good overview of multilevel modeling for binary data.


3.5.4 Regression Results


In all of our regressions, we start with a conventional single-level regression
specification, called Model I or the 'Standard Model'. In Model II a family random
(frailty) effect is incorporated into the model. Finally, in Model III we add a
community random (frailty) effect. Likelihood ratio tests are applied to test for
significant random effects in the linear as well as in the nonlinear model (Goldstein,
2003). Since Model III performs, based on the likelihood ratio tests, significantly
better than Models II and I in all of our analyses, we discuss our regression
results based on this model. We will, however, compare the results of Model III
with the inferior models to justify multilevel modeling.
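
The likelihood ratio comparison of nested models can be sketched as follows. The log-likelihood values are hypothetical, and the halving of the p-value is a common convention for testing a single variance component on the boundary of its parameter space. The text does not state which reference distribution was actually used, so this is an illustration under stated assumptions rather than a reproduction of the reported tests.

```python
import math


def lr_test_single_variance(loglik_restricted, loglik_full):
    """Likelihood ratio test for adding one random-effect variance."""
    lr = max(2.0 * (loglik_full - loglik_restricted), 0.0)
    # Upper tail of a chi-square distribution with one degree of freedom.
    p_chi2_1 = math.erfc(math.sqrt(lr / 2.0))
    # A variance cannot be negative, so the null value lies on the boundary;
    # a common convention halves the p-value (50:50 mixture of chi2_0 and chi2_1).
    p_boundary = 0.5 * p_chi2_1
    return lr, p_chi2_1, p_boundary


# Hypothetical log-likelihoods of a model without and with a family random effect.
lr, p_naive, p_boundary = lr_test_single_variance(-2450.0, -2443.2)
print(f"LR = {lr:.2f}, p (chi2_1) = {p_naive:.5f}, p (boundary-adjusted) = {p_boundary:.5f}")
```

The same function can be applied a second time to compare the model with a family effect against the model that adds the community effect.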


Undernutrition 


Results of stunting and wasting regressions are shown in Table 3.5 and 3.6, re- 
spectively. 


Stunting 


In Model III, stunting values are significantly better for girls than for boys, which
is a common finding in the literature on SSA countries (Svedberg, 1990;
Klasen, 1996; Wamani et al., 2007). Being a twin shows a significant negative
effect, which seems reasonable given the stronger competition for nutrition and the
fact that twins are smaller and lighter at birth than single births.
Children of better nourished mothers, as indicated by a higher body mass index,
have, ceteris paribus, a higher height for age score. Moreover, the higher the household's
wealth and the mother's educational attainment, and the better the access to water, the
better the stunting values of a child. On the other hand, the age of the woman at birth,
living in a rural area, and access to better sanitation facilities do not
seem to have an effect.¹⁷ ¹⁸

Interestingly, a child’s disease status (HIV, diarrhea, malaria) does not have a
significant impact on a child’s height for age status. Diseases have a rather short-term
impact on children’s health outcomes. The long-term indicator stunting may
therefore fail to display any significant negative relationship.


¹⁷ The significant positive coefficient of the variable "people per physician" is most likely driven
by the very high number of people per physician in North Eastern province, which nevertheless
exhibits very good stunting outcomes.

¹⁸ Once again, standard errors of all estimates mentioned above increase slightly from Model
I to Model III. The significance of the variables first born child, birth interval and Luo vanishes
when controlling for unobserved heterogeneity in the multilevel specifications. Neglecting highly
correlated outcomes of children within families and communities leads to false statistical inference,
justifying the use of multilevel models.

Children residing close to Lake Victoria show significantly
more favorable height for age outcomes. As pointed out in sections 3.2 and 3.5.2, the
high availability of fish in combination with very advantageous agricultural conditions
is likely to be the main reason for this puzzling result and the extraordinarily
low prevalence of stunted children in the region.

These results endorse, once again, our aforementioned theoretical considerations.
Adverse geographic factors in the Lake Victoria region (malaria suitability,
HIV endemicity and possibly unsafe drinking water) may contribute to the excess
risk of death for children in the region. They do not, however, affect a child’s
stunting status. Quite the opposite: the region around Lake Victoria exerts positive
influences on children’s nutritional outcomes. Fertile soils, a high level of
food security and high protein availability (fish) foster children's growth.
This result does not call into question the clear epidemiological relationship between
a child’s disease status and its mortality outcome. But the findings challenge the
adequacy of inferring children’s health and mortality outcomes from children’s
height status. Environmental factors, as found in the Lake Victoria region, can
substantially modify the relationship between children’s growth and health status.


Wasting 


In contrast to stunting, wasting outcomes of children are much more strongly affected
by short-term factors and, hence, are subject to much larger fluctuations. A poor
wasting status is often a consequence of having recently suffered from illness or
from insufficient food intake (Fawzi et al., 1997; UNICEF, 2008). Because of
these short-term variations and missing information on important factors,
wasting regressions are usually not considered by demographers and economists.
Technically speaking, much of the variation in wasting indicators among children is
due to short-term variation, e.g. whether a child got sick or not, which cannot be
captured appropriately by the set of variables obtained from common household
surveys. Therefore, determinants of anthropometric outcomes are much more difficult
to establish in wasting regressions than in stunting regressions. Nonetheless,
section 3.4.1 provides a priori reasons to believe that crucial
geographical factors are at play that should be verifiable even in the multivariate
case. In particular, the observation that food insecurity increases steadily
throughout the country with distance from Lake Victoria demands
investigation. Instead of using a dummy variable referring to the Lake Victoria region,
we introduce a continuous variable measuring the distance to Lake Victoria in
km. Regression results are presented in Table 3.6.

As expected, most of the variables fail to explain wasting differentials. In
our final model, Model III, the coefficients of mother’s educational status, the water
supply index and the first child dummy have the expected sign but narrowly miss
statistical significance. Just four variables are significant. Mother’s BMI, itself an
indicator of nutritional status, is positively associated with children’s weight. The
quality of the sanitation facility positively affects the weight-for-age status.

Once again, geographic and political factors seem to play a major role in determining
nutritional and anthropometric outcomes. The lower a province’s ratio of
people per physician, i.e. the easier the access to medical care in that province, the
higher the children’s wasting score. The closer to Lake Victoria, the higher the average
z-values of wasting.¹⁹


Mortality 


Results of the under-5 mortality regressions are presented in Table 3.7.²⁰ Once
again, Model I is the standard single-level regression model. Model II incorporates
a household level random effect. In Model III a community effect is included in
addition to the household effect. Regression results will again be reported based
on the statistically superior model, Model III.


¹⁹ As explained in section 3.4.1, we would expect the distance effect to vary continuously with
respect to wasting outcomes. Robustness checks, which are not reported here, confirm that distance
to Lake Victoria only seems to matter when captured continuously. In the full model the distance
effect, when included as a dummy variable, always turned out to be statistically insignificant.

²⁰ Hazard ratios are to be interpreted in relation to 1, where the hazard rate is the instantaneous
risk of death at t conditional on survival up to t. Thus, a hazard ratio of 1.2 implies a 20% higher
risk of death.

In Model III, coefficients on all variables that are commonly included in under-5
mortality regressions show a reasonable economic size, take the expected signs
and are in line with the empirical literature. The hazard rate is approximately 20%
lower for girls than for boys. Being a twin increases the risk of death by
50%. The location of the family, in terms of rural or urban, does not have a significant
impact on survival chances. Interestingly, the same is true for the body
mass index of the mother: children of better nourished mothers do
not exhibit a lower risk of death. Children from richer households do not, ceteris
paribus, seem to have a lower probability of death.²¹ The educational background
of the mother emerges as highly significant, showing that children’s survival
chances are higher, the better educated the mother. Neither the quality of the water
source nor the hygienic status of the household seems to have a considerable
impact on under-5 mortality in Kenya. Moreover, a higher age of the mother at
birth, a longer preceding birth interval and being the first born child in a family
increase a child’s survival probability. Significant squared terms with reversed
signs imply diminishing returns to these factors.

Before examining the context specific determinants of under-5 mortality, it is
useful to briefly discuss the benefits of the adopted multilevel modeling
strategy in this particular case. Comparing the highly significant family
random effect of 0.60 in Model II with the family random effect of 0.90 from the unreported
regression model that contains no covariates but only the family random effect
allows us to conclude that the observed covariates explain approximately
33% of the variation in child deaths across families. The family random
effect of 0.60 is, however, likely to pick up unobserved heterogeneity at the community
level, meaning that variation in child mortality across families could partly
be explained by disparities in unobserved characteristics across communities. Indeed,
the family random effect decreases by another 30% to 0.42 in Model III. The
significant unobserved heterogeneity at the community level of 0.16 might be due
to differences in unobserved child or family factors across communities. Since we
included a large set of variables at both levels, we nevertheless interpret this
effect as stemming from unobserved community characteristics, i.e. geographic or infrastructure
characteristics, that play a major role in explaining differences in mortality rates
across Kenya. As expected, after the inclusion of family and community random
effects the standard errors of almost all covariates increase, lowering their statistical
significance. Moreover, we observe not only changes in the standard errors
but also differences when comparing the hazard estimates of the models. Consistent
with previous literature on multilevel frailty models (Omariba et al., 2007;
Sastry, 1997a), effects of socioeconomic household level variables (assets, education)
increase, while those of individual risk variables (birth order, previous child died)
decrease. Especially striking is the decreasing effect of the variable that indicates
whether the previously born child is still alive. This variable, highly correlated
among siblings, is a clear indicator of families with higher mortality risk. The
higher the unobserved frailty effect, the higher the probability that a sibling has
already died. Neglecting the frailty leads to an overestimation of this effect. In
contrast, factors that are independent of the unobserved frailty and independent among
siblings, e.g. the sex of the child or the age of the mother, show only slight changes.²²

²¹ This missing significance might to some extent result from the high correlation (0.47) of the
asset index with the educational status of the mother.
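
The percentages quoted in this comparison follow from a simple proportional reduction of the family random effect between models. The short sketch below reproduces the arithmetic from the three values reported in the text; the function and variable names are introduced here for illustration only.

```python
def proportional_reduction(before, after):
    """Share of a random effect removed when moving to a richer model."""
    return (before - after) / before


# Family random effects reported in the text.
empty_model = 0.90  # family random effect only, no covariates (unreported model)
model_2 = 0.60      # covariates plus family random effect
model_3 = 0.42      # covariates plus family and community random effects

explained_by_covariates = proportional_reduction(empty_model, model_2)
absorbed_by_community = proportional_reduction(model_2, model_3)
print(f"{explained_by_covariates:.0%}, {absorbed_by_community:.0%}")
```

The first figure is the share of between-family variation attributed to the observed covariates, the second the further share that moves to the community level once a community random effect is added.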

Taking into account the specific cultural, geographic and political context, we
included variables on ethnicity, HIV status, malaria and diarrhea prevalence, the
Lake Victoria region, the number of people per physician and per capita health
expenditures. Given the strong correlation among these factors, their simultaneous
inclusion seems crucial for obtaining unbiased estimates. All of these variables
turn out to be significant.

Even after the inclusion of behavioral characteristics, being born to a Luo
mother seems to have an adverse effect on survival chances. Since we control for
geographical factors and HIV status, this result strongly points to an influence
of adverse cultural practices (pre- and postnatal behavior) that were either
unobserved or could not be considered in the regression. The potential influence
of such practices was illustrated in section 3.5.2 by the low share of Luo mothers
giving birth in official health centers and the low vaccination rates of Luo
children, both indicative of adverse pre- and postnatal behavior. Omariba et al.
(2007) conduct a similar study on child mortality in Kenya but neglect HIV
infection and geographic factors. They report an excess risk of 10.5 for Luo
children compared to children of Kikuyu²³ mothers. Their much higher coefficient
suggests that the Luo effect is exaggerated in their study because of omitted but
highly correlated variables.

The HIV status of the mother has a severe negative impact on the survival chances
of a child. We estimate a hazard ratio of around 1.5, bearing in mind that the
coefficient is likely to be underestimated.²⁴

Based on the results of Model III, the excess risk of living in a high-risk malaria
area is around 35%, which confirms that malaria prevalence has a strong impact on
mortality rates. Since, to our knowledge, no studies exist that explicitly capture
malaria prevalence as a covariate in a comparable setting, we are not able to
compare the magnitude of the coefficient directly with other studies. Nonetheless,
recent studies from the medical literature (Snow et al., 1998; Omumbo et al., 2004;
Ndugwa et al., 2008) commonly underscore the still massive impact of malaria on
mortality in Kenya and in Sub-Saharan Africa more generally.

²² Underestimation of the effects of socioeconomic variables is a standard result
in hazard models that do not control for unobserved heterogeneity. The mathematical
foundation can be found in Guo and Rodriguez (1992).

²³ Living primarily in Nairobi, Central province and the Rift Valley, the Kikuyu
represent the largest ethnic group in Kenya.

²⁴ In a longitudinal survey for rural Tanzania, Ng’weshemi et al. (2003) report a
child death hazard ratio for maternal HIV infection of 2.3. Zaba et al. (2005) find
a hazard ratio of 2.9 in a cohort study for Uganda, Tanzania, and Malawi. In a
retrospective panel data study for Burkina Faso, Becher et al. (2004) estimate the
hazard ratio associated with the mother's survival status to be 5.4 for children
aged 1-5.

Besides malaria and HIV, diarrhea is presumed to be one of the most important
drivers of child mortality. Our estimates suggest that living in a high-risk
diarrhea community almost doubles the risk of dying. This effect is possibly
triggered by poor water quality: even though we controlled for the type of water
source, no direct indicator of water quality was available.
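The effect sizes quoted in this section can be related to the underlying regression coefficients with a short calculation. The sketch below assumes the usual proportional hazards convention, in which a coefficient beta corresponds to a hazard ratio exp(beta) and the excess risk is the hazard ratio minus one. The function names are illustrative and not taken from the original analysis.

```python
import math


def hazard_ratio(beta):
    """Hazard ratio implied by a log-hazard coefficient (assumed convention)."""
    return math.exp(beta)


def coefficient(ratio):
    """Log-hazard coefficient implied by a hazard ratio."""
    return math.log(ratio)


def excess_risk_percent(ratio):
    """Excess risk relative to the reference group, in percent."""
    return (ratio - 1.0) * 100.0


if __name__ == "__main__":
    # Ratios taken from the text: about 1.5 for maternal HIV and an excess
    # risk of about 35% (i.e. a ratio of about 1.35) for high-risk malaria areas.
    for label, ratio in [("maternal HIV", 1.5), ("high-risk malaria area", 1.35)]:
        print(f"{label}: ratio={ratio:.2f}, "
              f"beta={coefficient(ratio):.3f}, "
              f"excess risk={excess_risk_percent(ratio):.0f}%")
```

Reading the results this way makes clear that an excess risk of 35% and a ratio of 1.35 are the same statement, and that a ratio close to 2 corresponds to an approximate doubling of risk.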

The negative effects of limited access to public funds are evident. Children
residing in provinces that receive lower public health expenditures and have a
higher number of people per physician exhibit significantly higher risk exposure.
Political discrimination thus seems to be an important factor in the spatial
variation in under-5 mortality rates.

Despite controlling for geography, ethnicity and political outcomes, we still
obtain a positive and significant coefficient for the Lake Victoria region on
under-5 mortality. This demonstrates the uniquely hostile environment of the
region, which could not be sufficiently captured by the variables at hand.²⁵ The
positive dummy might also pick up the natural malaria barrier of the East African
Rift Valley, which rises only some kilometers away from Lake Victoria and
drastically reduces the risk of malaria.

The regression results confirm the theoretical considerations and descriptive
findings of sections 3.4 and 3.5.2, respectively. The very high mortality rates of
the Lake Victoria region rest upon the simultaneous impact of unfavorable
geographic, cultural and political factors. There might be other regions in
Sub-Saharan Africa with severe water pollution, high climate suitability for
malaria transmission and high susceptibility to HIV infection. In the case of the
Lake Victoria region of Kenya, however, these adverse geographic conditions are
found in a territory that is not only primarily populated by an ethnic group
displaying adverse pre- and postnatal behavior but that also suffers from political
discrimination, leading to underdeveloped access to health infrastructure.


²⁵ Recall that the risk exposure to diarrhea and malaria was measured at the
community and district level, respectively. Moreover, data on public health
expenditures were available at the provincial level only.


Family clustering decomposition


The regression analyses conducted above provided insights into the causal effects
of potential individual malnutrition and mortality drivers. From a policy
perspective it is of interest to know which covariates explain most of the
variation in malnutrition and mortality outcomes across families or communities.
Which covariates link children’s risk of malnutrition or children’s survival
chances within a family or community? Knowledge of these key variables allows
policy makers to improve the targeting performance of poverty alleviation
programmes.

Insight into this question can be gained by observing how the variance of the
family or community random effect (frailty) changes when each covariate is included
on its own in the null model or omitted on its own from the full model. If a
specific variable accounts for a large part of the overall variation, the variance
of the random effect should decrease substantially when the variable is included in
the null model and increase substantially when it is omitted from the full model.
We conduct this test for each variable in the stunting, wasting and mortality
regressions. For clarity, we limit this analysis to the clustering of malnutrition
and mortality outcomes within families and therefore use the Model II
specification.

The results are shown in Table 3.8. The first row reports the variation of the
family random effect when no covariate is included (null model) and when all
covariates are included (full model). The results indicate that the entire set of
covariates accounts for 27%, 39% and 34% of the family clustering, i.e. the
between-family variation, in the stunting, wasting and mortality regressions,
respectively.
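The decomposition logic described above can be sketched in a few lines. This is a minimal illustration under the assumption that the share of clustering explained is computed as the proportional change in the estimated family random-effect variance. The variance values in the usage example are invented for illustration and are not the estimates reported in Table 3.8; the function and covariate names are likewise hypothetical.

```python
def proportional_change(var_reference, var_other):
    """Change in frailty variance from var_reference to var_other,
    expressed relative to var_reference."""
    return (var_other - var_reference) / var_reference


def decompose_clustering(var_null, var_full, added_to_null, omitted_from_full):
    """Summarise how single covariates shift the family random-effect variance.

    added_to_null:     {covariate: variance when only this covariate is included}
    omitted_from_full: {covariate: variance when only this covariate is left out}
    """
    summary = {
        "all covariates": {
            "reduction_vs_null": -proportional_change(var_null, var_full),
        }
    }
    for name in added_to_null:
        summary[name] = {
            # Drop in variance when the covariate alone enters the null model.
            "reduction_vs_null": -proportional_change(var_null, added_to_null[name]),
            # Rise in variance when the covariate alone leaves the full model.
            "increase_vs_full": proportional_change(var_full, omitted_from_full[name]),
        }
    return summary


if __name__ == "__main__":
    # Hypothetical variance estimates, chosen only to illustrate the calculation.
    result = decompose_clustering(
        var_null=1.00,
        var_full=0.73,
        added_to_null={"mother_hiv": 0.92, "child_sex": 0.99},
        omitted_from_full={"mother_hiv": 0.80, "child_sex": 0.735},
    )
    for name, shares in result.items():
        print(name, {key: round(value, 3) for key, value in shares.items()})
```

A covariate that matters for between-family clustering shows a sizeable value on both measures, whereas a covariate that mainly varies between siblings leaves the family-level variance almost unchanged.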

Concerning the stunting regression, the mother’s HIV status and children’s
diarrhea status account for the largest proportion of the between-family variation
in stunting outcomes. This result holds both for the inclusion of the variables in
the null model and for their exclusion from the full model. The same variables
appear to be largely responsible for the family-level clustering of wasting
outcomes. The BMI of the mother emerges as another important factor for the
between-family variation in wasting rates. As in the regression results, a
different pattern emerges for the within-family clustering of under-5 mortality.
Mother’s education, the survival status of the previous child and being the child
of a Luo mother matter most in explaining differences in mortality outcomes between
families in Kenya.

These results should not be confused with the regression results of Tables
3.5 - 3.7. For example, being female or living around Lake Victoria has a
significant impact on a child’s nutritional and mortality outcomes. Those
variables are, however, better suited to explaining the between-sibling and
between-region variation, respectively, than the between-family variation.
3.6 Conclusion


Kenya’s Lake Victoria region is marked by an interesting puzzle. Under-5 mortality
rates are by far the highest in the country, while at the same time anthropometric
indicators of children show remarkably good values. The extent of this anomaly
becomes even more astonishing when comparing the Lake Victoria area to other
regions in Sub-Saharan Africa (SSA). Nowhere else in the whole of SSA do we find
such a strong disconnect between anthropometric and mortality outcomes.

In order to examine and understand the causes of this unusual phenomenon, we take
the uncommon step of analyzing the determinants of mortality and undernutrition
jointly. Moreover, to reduce the likelihood of obtaining biased and inefficient
estimates in our multivariate regressions, we apply suitable multilevel modeling
techniques and construct a new set of context-specific variables that supplements
the conventional DHS data.

Our findings point to a unique interplay of cultural, geographical and political
factors in the Lake Victoria region that is responsible for the described paradox.
Concerning the under-5 mortality pattern in Kenya and around Lake Victoria, we find
that a pronounced disease environment characterized by extremely high malaria
prevalence, polluted water sources and high rates of infectious diseases such as
HIV/AIDS is one of the key drivers of the massive under-5 mortality rates in the
lake region. Furthermore, we see that even after controlling for mother’s age at
birth, birth spacing, birth order and HIV status, an ethnicity-specific effect
remains. Being born to a Luo mother affects survival chances adversely, most likely
because of unfavorable unobserved pre- and postnatal behavior. Political
discrimination also seems to be an important factor in the spatial variation in
under-5 mortality rates. Children residing in provinces that receive lower public
health expenditures, such as Nyanza province, exhibit significantly higher risk
exposure. In addition, the results indicate that even after the inclusion of a rich
set of covariates and controlling for clustering in unobserved characteristics, we
are still confronted with an unusually high mortality rate in the Lake Victoria
region that remains unexplained by the covariates and that is most likely
attributable to insufficiently captured geographical and political factors.

A similar interplay of geographic conditions and cultural factors is found to
underlie the extremely low incidence of stunting and wasting in the Lake Victoria
region. Fish consumption in combination with an overall food-secure situation spurs
the growth of children close to the lake and therefore leads to the much greater
body height of children in the Lake Victoria area, while the food security
situation per se leads, ceteris paribus, to better wasting rates in the area.

Although these results are already very important for policy making, we further
examined which single factors contribute most to explaining differences in
malnutrition and mortality between Kenyan families. Our analysis reveals that the
HIV status of the mother and children’s diarrhea status explain the largest part of
the variation in stunting outcomes between families, while the educational
attainment of the mother, the survival status of the previous child and being a Luo
turn out to be the most important sources of mortality differentials between
families.

Our findings demonstrate the relevance of considering and understanding the
country-specific context when data on child mortality and malnutrition are
analyzed. We do not challenge the epidemiological literature, in the sense that we
do not question that a causal relationship between nutritional and mortality
outcomes exists at the individual level. The analysis does, however, raise a
serious concern about using children’s height status as a reliable proxy for health
or income. This is only advisable when geographic, cultural and political contexts
are comparable, which is often unlikely to be the case in cross-country or
cross-regional analyses.
Table A.9: Regional Codes


Code: [values not legible in the source]

Region
Hauts Bassins
Boucle du Mouhoun
Sahel
Est
Sud-Ouest
Centre Nord
Centre Ouest
Plateau Central
Nord
Centre Est
Centre (Ouagadougou)
Cascades
Centre Sud


Source: ORC (2004) 
Figure A.8: BLUP of HH Size - 1994
Figure A.9: BLUP of HH Size - 1998
Figure A.10: BLUP of HH Size - 2003
Figure A.11: Youth per Adult - 1994
Figure A.12: Children per Adult - 1998
Figure A.13: Youth per Adult - 2003
Figure A.14: Education - 1994
Figure A.15: Education - 1998
Figure A.16: Education - 2003